Abstract


Title  of  Dissertation: 

A  functional  transcriptomic  approach  to  understanding  the  sand  fly  vector  relationships  to 
the  host  and  Leishmania  parasites 

Ryan  C.  Jochim,  Doctor  of  Philosophy,  2008 

Thesis  Directed  by: 

Jesus  G.  Valenzuela,  Ph.D. 

Principal  Investigator,  Vector  Molecular  Biology  Unit,  NIAID,  NIH 

Phlebotomine  sand  flies  are  the  only  known  biological  vectors  of  Leishmania. 

The  maintenance  of  Leishmania  can  be  represented  by  the  epidemiological  triad;  the 
relationship  between  the  vertebrate  host,  the  parasite  and  the  insect  vector.  This  research 
focuses  on  the  relationships  between  the  sand  fly  and  the  host  and  also  between  the  sand 
fly  and  Leishmania.  The  feeding  success  of  the  sand  fly  and  the  transmission  of 
Leishmania  are  linked  to  the  pharmacological  cocktail  of  molecules  in  the  sand  fly  saliva 
-  the  host-sand  fly  relationship.  Adenosine  deaminase  (ADA)  was  identified  as  a 
salivary  constituent  of  Lutzomyia  longipalpis  and  is  not  present  in  the  saliva  of 
Phlebotomus  papatasi,  P.  argentipes,  P.  perniciosus  and  P.  ariasi,  leading  to  the  false 
presumption that ADA is a Lutzomyia-specific enzyme. Two transcripts encoding
ADA  were  identified  by  the  analysis  of  cDNA  libraries  produced  from  salivary  glands  of 
P. duboscqi. Our research revealed the presence of ADA activity in the saliva of P.
duboscqi and demonstrated that this activity was attributable to the identified transcripts.
Our  expert  use  of  functional  transcriptomics  was  expanded  from  sand  fly  saliva  to 
include  sand  fly  midgut  tissue.  The  successful  completion  of  the  Leishmania  life  cycle, 
from  the  ingested  amastigote  to  the  transmitted  metacyclic  promastigote,  occurs  within 
the sand fly midgut - the sand fly-Leishmania relationship. We randomly sequenced a
large  number  of  transcripts  from  high-quality,  full-length,  female  P.  papatasi  and  Lu. 
longipalpis  midgut-specific  cDNA  libraries.  By  means  of  customized  bioinformatics 
analysis,  cDNA  libraries  were  evaluated  from  sugar-fed  and  blood-fed  P.  papatasi  and 
Lu.  longipalpis,  in  the  presence  or  absence  of  L.  major  or  L.  infantum  chagasi, 
respectively.  Subsequently,  we  evaluated  cDNA  libraries  generated  from  L.  infantum 
chagasi-infected and uninfected Lu. longipalpis midguts after blood meal digestion.

Sand  fly  midgut  transcripts  modulated  by  blood-feeding  or  Leishmania  colonization  were 
identified  using  a  functional  transcriptomic  approach.  By  cataloging  the  midgut  tissue 
molecular  repertoire,  we  identified  novel  sand  fly  midgut-derived  molecules,  including 
proteases,  microvillar  proteins,  peritrophins,  oxidative  stress  proteins  and  antimicrobials. 
Expression  profiling  of  selected  transcripts  from  Lu.  longipalpis  described  the  temporal 
midgut  transcript  abundance  and  verified  the  striking  effect  of  L.  infantum  chagasi 
colonization  on  microvillar  protein,  protease  and  peritrophin  transcription. 

Epidemiology

Leishmaniasis is a spectrum of diseases caused by protozoan parasites
of the genus Leishmania and transmitted by various species of sand flies.
Specific species of the parasite are commonly associated with different clinical
manifestations.  Leishmaniasis  afflicts  88  countries  and  affects  approximately  12  million 
people  with  two  million  new  cases  each  year;  thus,  leishmaniasis  is  becoming  a 
worldwide  re-emerging  public-health  problem  [1].  Current  epidemiological  assessments 
report  500,000  cases  and  80,000  deaths  annually  due  to  visceral  leishmaniasis  [1]. 
Although globally distributed, almost all of the cases of visceral leishmaniasis occur in
Bangladesh,  India,  Nepal,  Sudan  and  Brazil.  Cases  of  cutaneous  leishmaniasis  occur  in 
Afghanistan,  Algeria,  Brazil,  Iran,  Peru,  Saudi  Arabia  and  Syria  (Figure  1)  [2-4]. 

Clinical  aspects  of  leishmaniasis 

The  clinical  forms  of  the  disease  include  visceral,  cutaneous,  mucocutaneous,  and 
diffuse  cutaneous  leishmaniasis. 

Visceral  disease  occurs  after  the  parasite  infects  cells  of  the  reticuloendothelial 
system  and  there  is  suppression  of  specific  cell-mediated  immunity.  Multiplying 
unchecked by the immune system, both the parasites and infected cells replicate mainly
in spleen, liver and bone marrow, forming granulomas and resulting in serious
complications. Visceral leishmaniasis is characterized by weight loss, fever,
lymphadenopathy and hepatosplenomegaly, among numerous other conditions. The
visceral  form  of  the  disease  can  be  caused  by  L.  donovani,  L.  infantum  and  L.  tropica  in 
the  Old  World  and  by  L.  infantum  chagasi  and  L.  amazonensis  in  the  New  World. 
Interestingly, L. infantum chagasi is a New World variant of L. infantum [5]. Visceral
leishmaniasis is a fatal disease if left untreated, and treatment is a lengthy procedure
commonly requiring at least 3-4 weeks of drug administration and possibly inpatient
monitoring depending upon the drug used [6].

Figure 1. World maps indicating endemic areas of visceral and cutaneous leishmaniasis.

Cutaneous  leishmaniasis  begins  as  a  papule  at  the  sand  fly  bite  site,  as  the  parasite 
continues  to  recruit  and  replicate  in  macrophages.  During  this  period  the  cellular 
response  is  being  mounted  or  a  T-helper  2  cellular  response  is  initiated,  both  of  which  are 
ineffective  at  activating  the  leishmanicidal  activity  of  infected  macrophages.  The 
infection  can  develop  to  an  ulcerated  lesion  if  the  T-helper  1  cellular  response  is  vigorous 
enough  to  cause  adjacent  tissue  damage  once  recruited  macrophages  are  activated  to  kill 
the  Leishmania  amastigotes.  Diffuse  cutaneous  leishmaniasis  is  a  variation  of  common 
cutaneous leishmaniasis, which results in numerous skin lesions due to an inefficient
cell-mediated immune response to the parasite. Cutaneous leishmaniasis can be caused
by numerous species of Leishmania parasite, including L. major, L. tropica, and L.
aethiopica  in  the  Old  World  and  L.  mexicana,  L.  braziliensis,  and  L.  peruviana  in  the 
New  World  [6].  Diffuse  cutaneous  leishmaniasis  is  more  commonly  associated  with  L. 
aethiopica  and  L.  mexicana  parasites.  Mucosal  leishmaniasis  is  a  South  American 
disease  most  commonly  caused  by  L.  braziliensis  and  is  characterized  by  mucosal 
ulcerations  and  disfigurement  [4].  Most  cutaneous  lesions  will  resolve  over  time,  leaving 
a scar; however, treatment for persistent lesions and to reduce scarring can consist of
paromomycin  ointment,  antimony,  or  heat  therapy  [4].  Diffuse  cutaneous  and  mucosal 
leishmaniases  are  more  difficult  to  treat  and  immunotherapy  may  be  applied  in 
conjunction  with  amphotericin  or  antimony  treatment.  These  are  examples  of  the  more 
commonly diagnosed leishmaniasis pathogens; however, the parasite species
and the disease afflictions that they can cause are far more numerous. The number of
species of Leishmania capable of causing human disease is greater than for any other
parasitic  infection,  and  a  complementary  complexity  exists  in  the  number  of  sand  fly 
species  that  are  proven  or  possible  insect  vectors  of  Leishmania  parasites. 

Sand  flies 

Sand  flies  are  small  Dipteran  insects  of  the  Family  Psychodidae  and  Subfamily 
Phlebotominae,  which  require  a  blood  meal  from  a  vertebrate  host  for  egg  production.  It 
is  during  the  acquisition  of  the  blood  meal  that  there  is  the  potential  for  disease 
transmission.  Phlebotomus  and  Lutzomyia  are  the  most  important  genera  of 
anthropophilic  sand  flies,  as  they  are  vectors  of  human  and  animal  diseases.  Of  the 
approximately  700  species  of  sand  flies,  about  80  have  been  implicated  as  disease 
vectors.  Some  of  the  sand  flies  implicated  in  transmitting  Leishmania  have  not  been  fully 
incriminated as definitive vectors (Tables 1 and 2) [7-11]. A large research effort beginning at the turn of
the 20th century was instrumental in incriminating sand flies in the transmission of L. tropica
(1941) and L. infantum chagasi (1977) [8, 12]. Figure 2 is a historical timeline of events
leading  to  the  eventual  definitive  incrimination  of  sand  flies  as  Leishmania  vectors  in 
both  the  New  World  and  the  Old  World  [13-27].  The  work  of  those  pioneering 
individuals,  spanning  nearly  100  years,  set  the  foundation  of  our  current  understanding  of 
the  inherent  complexity  in  the  epidemiology  and  basic  biology  of  the  transmission  of 
Leishmania  parasites. 

Table 1: Old World distribution of Leishmania and the proven or suspected vectors

Parasite (Disease)¹               Sand fly species²     Sand fly distribution³
--------------------------------  --------------------  ------------------------------------
Leishmania aethiopica (LCL, DCL)  Phlebotomus longipes  Kenya, Ethiopia
                                  P. pedifer            Kenya, Ethiopia
                                  P. sergenti           Ethiopia
L. donovani (LCL, PKDL, VL)       P. alexandri          N. Africa to W. China
                                  P. argentipes         Bangladesh, Nepal, India
                                  P. celiae             Kenya, S. Ethiopia
                                  P. chinensis          Northern, Central China
                                  P. longiductus        India
                                  P. martini            E. Africa, Ethiopia
                                  P. mongolensis        Central Asia
                                  P. orientalis         Sudan, Ethiopia, Saudi Arabia, Yemen
L. infantum (LCL, VL)             P. ariasi             Western Mediterranean
                                  P. brevis             Northern Iran to Caucasus
                                  P. chinensis          Northern, Central China
                                  P. halepensis         Jordan, Lebanon, Iraq
                                  P. kandelakii         Iran, Afghanistan
                                  P. langeroni          Egypt-Tunisia
                                  P. longicuspis        North Africa
                                  P. neglectus          North Africa, Central Asia
                                  P. perfiliewi         Mediterranean Basin, Algeria
                                  P. perniciosus        Western Mediterranean
                                  P. sichuanensis       China
                                  P. smirnovi           Central Asia
                                  P. tobbi              Eastern Mediterranean
                                  P. transcaucasicus    Caucasus
L. killicki (LCL)                 P. alexandri          Central Tunisia
                                  P. chabaudi           Central Tunisia
                                  P. papatasi           Central Tunisia
L. major (LCL)                    P. alexandri          North Africa to Western China
                                  P. ansarii            Iran
                                  P. caucasicus         Iran
                                  P. duboscqi           Sahelian Africa, Kenya
                                  P. papatasi           N. Africa, Middle East
                                  P. salehi             Iran, Pakistan
L. tropica (LCL, VL)              P. aculeatus          Kenya
                                  P. guggisbergi        Kenya
                                  P. halepensis         South Caucasus Eurasia
                                  P. longiductus        India
                                  P. sergenti           Middle East, North Africa

¹Abbreviations: LCL, localized cutaneous leishmaniasis; DCL, diffuse cutaneous leishmaniasis; PKDL,
post-kala-azar dermal leishmaniasis; VL, visceral leishmaniasis
²Species names which are bolded are considered proven, although not necessarily fully incriminated,
vectors. Other species are implicated as vectors of leishmaniasis based on fewer incriminating factors.
³Sand fly distribution listed here is in correlation with the distribution of the respective leishmaniasis.

Table 2: New World distribution of Leishmania and the proven or suspected vectors

Parasite (Disease)¹                         Sand fly species²          Sand fly distribution³
------------------------------------------  -------------------------  -------------------------
Leishmania (L.) amazonensis (LCL, DCL, VL)  Lutzomyia flaviscutellata  Northern South America
                                            Lu. olmeca nociva          Amazon basin
L. (V.) braziliensis (LCL, MCL)             Lu. amazonensis            Northern Amazon basin
                                            Lu. ayrozai                Southeastern Brazil
                                            Lu. carrerai               Western Amazon basin
                                            Lu. complexa               Para, Brazil
                                            Lu. intermedia             Southern Brazil
                                            Lu. llanosmartinsi         Brazil
                                            Lu. migonei                Brazil, Venezuela
                                            Lu. ovallesi               Guatemala, Venezuela
                                            Lu. panamensis             Central, N. South America
                                            Lu. paraensis              N. South America
                                            Lu. pessoai                Southern Brazil
                                            Lu. spinicrassa            Colombia
                                            Lu. trinidadensis          Venezuela
                                            Lu. wellcomei              Para, Brazil
                                            Lu. whitmani               Eastern Brazil
                                            Lu. yucumensis             Bolivia
L. (L.) chagasi (LCL, VL)                   Lu. evansi                 Colombia
                                            Lu. longipalpis            Central and South America
L. (V.) garnhami (LCL)                      Lu. youngi                 Venezuela
L. (V.) guyanensis (LCL, MCL)               Lu. anduzei                N. South America
                                            Lu. umbratilis             Amazon basin
L. (V.) lainsoni (LCL, MCL)                 Lu. ubiquitalis            Amazon basin
L. (L.) mexicana (LCL, DCL)                 Lu. anthophora             Southern Texas, USA
                                            Lu. ayacuchensis           Ecuador
                                            Lu. diabolica              Southern Texas, USA
                                            Lu. olmeca olmeca          Central America
                                            Lu. ylephiletor            Guatemala
L. (V.) lindenbergi (CL)                    Lu. antunesi               Para, Brazil
L. (V.) naiffi (CL)                         Lu. squamiventris          Brazil
L. (V.) panamensis (LCL, MCL)               Lu. gomezi                 Central, N. South America
                                            Lu. panamensis             Central, N. South America
                                            Lu. trapidoi               Central America
                                            Lu. ylephiletor            Central America
L. (V.) peruviana (LCL)                     Lu. ayacuchensis           Peru
                                            Lu. peruensis              Peru
                                            Lu. tejadai                Peru
                                            Lu. verrucarum             Peru
L. (L.) pifanoi (CL)                        Lu. flaviscutellata        N. South America
L. (V.) shawi (CL)                          Lu. whitmani               Brazil
L. (L.) venezuelensis (CL)                  Lu. olmeca                 N. South America

¹Abbreviations: (L.), subgenus Leishmania; (V.), subgenus Viannia; LCL, localized cutaneous
leishmaniasis; DCL, diffuse cutaneous leishmaniasis; MCL, mucocutaneous leishmaniasis; VL, visceral
leishmaniasis
²Species names in bold are considered proven, although not necessarily fully incriminated, vectors. Other
species are implicated as vectors of leishmaniasis based on fewer incriminating factors.
³Sand fly distribution listed here is in correlation with the distribution of the respective leishmaniasis.

Figure 2. Timeline of events leading to the incrimination of New World and Old World
sand fly vectors of leishmaniasis. Circles denote Old World sand fly findings and diamonds
indicate New World sand fly research.

Timeline entry: Deane & Deane describe heavy flagellate infections in Lu. longipalpis and find
foxes infected with L. chagasi. Additionally, they show that Lu. longipalpis became infected when
fed on infected foxes.

Leishmania life cycle

Leishmania has a two-stage life cycle: one in the mammalian host as an
intracellular amastigote, and one in the sand fly vector as an extracellular motile
promastigote. After a vertebrate host receives a bite from an infectious sand fly, the
inoculated metacyclic promastigote parasite invades host macrophages and multiplies
within the phagolysosome. If an uninfected sand fly feeds on an infectious host, it
ingests a blood meal containing amastigote-infected macrophages, thus beginning the
life  cycle  in  the  invertebrate  host  (Figure  3.1).  Amastigotes  are  released  after  rupture  of 
the  macrophage,  and  the  parasite  begins  the  developmental  cycle  to  the  first  flagellated 
form  of  the  parasite  within  the  sand  fly,  the  procyclic  promastigote  (Figure  3.2). 

The  proliferation  and  differentiation  of  the  first  parasite  stages  occurs  within  the 
peritrophic  matrix  (PM),  a  proteo-chitin  structure  formed  to  encapsulate  the  blood  meal 
immediately  after  feeding.  The  PM  offers  a  relatively  protected  environment  for  the 
Leishmania  during  the  first  hours  after  the  blood  meal,  as  the  amastigote  is  susceptible  to 
killing by digestive enzymes [28]. The procyclic form of the promastigote is the first
multiplying stage of the parasite within the sand fly and forms rosettes of replicating
parasites within the PM as the blood meal is digested [29]. Parasites of the subgenus
Leishmania develop only in the midgut and foregut (suprapylarian), whereas parasites of
the subgenus Viannia have a developmental stage occurring in the hindgut (peripylarian)
[30,  31].  The  outer  surface  of  procyclic  promastigotes  is  covered  in  a  dense  layer  of 
lipophosphoglycan (LPG), a glycoconjugate that has multiple functions [32]. LPG has
been shown to determine sand fly species restriction and is the ligand necessary for
parasite attachment to the midgut epithelium in what are usually
referred to as restrictive vectors, such as P. sergenti and P. papatasi [33, 34]. Sand flies
that are considered permissive (they are able to harbor several different species of
Leishmania) have been shown to possess an LPG-independent mechanism of Leishmania
retention [35].

Figure 3. Life cycle of Leishmania parasites within the sand fly.

The next developmental form is the nectomonad promastigote (Figure
3.3),  a  non-dividing  and  highly  motile  stage  that  is  most  abundant  about  three  days  after 
the  sand  fly  ingests  the  infected  blood  meal.  In  Lu.  longipalpis  infected  with  L. 
mexicana, the nectomonad promastigotes migrate to the anterior area of the PM, whereas
the procyclic promastigotes migrate to the posterior area of the PM [36]. At this point in
development the nectomonad promastigotes must escape the PM and attach to the midgut
epithelium (Figure 3.3). Escape from the PM is facilitated by a Leishmania-derived
secretory chitinase and likely by the sand fly midgut chitinase [37, 38]. Electron
microscopy has shown that the flagella of nectomonad promastigotes appear to be
inserted between the microvilli of the midgut epithelium and that there are even instances
of flagella entering degenerating epithelial cells [30]. The exact role or outcome of this
interaction is not clear; however, it is known that the parasite must bind to the midgut to
prevent elimination during the defecation of the digested blood meal. The next parasite
form in development is the leptomonad promastigote (Figure 3.5), whose role is to
expand the parasite population within the thoracic midgut and produce promastigote
secretory gel (PSG), an important component in transmission [36]. Haptomonads,
promastigotes that are likely derived from leptomonads, bind to the cuticular surface of
the stomodeal valve by a hemidesmosome-like interaction of the flagellum (Figure 3.4)
[30]. Leptomonads also give rise to the infective form of the parasite, the metacyclic
promastigote  (Figure  3.6).  The  metacyclic  promastigote  is  a  smaller,  highly  motile  form 
of the parasite that is covered in an altered configuration of LPG side chains, negating
attachment  to  the  midgut  epithelium  [39].  It  is  generally  regarded  as  necessary  for  the 
parasite  to  develop  to  the  metacyclic  promastigote  for  transmission  to  occur  to  a 
mammalian  host,  as  this  form  of  the  promastigote  is  not  only  the  most  infective  but  also 
greatly  resists  killing  by  normal  human  serum  [40,  41]. 

The  two  main  theories  of  transmission  are  direct  inoculation  via  the  proboscis 
with  promastigotes  originating  from  the  anterior  foregut,  or  regurgitation  of 
promastigotes,  which  brings  promastigotes  from  behind  the  pharynx  from  the  posterior 
foregut  and  anterior  midgut  [42].  While  the  exact  mechanics  of  transmission  are  still 
under  debate,  recent  work  has  shown  that  about  10  times  the  number  of  promastigotes  are 
egested  during  feeding  than  are  present  in  the  foregut,  implicating  regurgitation  as  the 
principal mechanism of transmission [43]. Two theories explaining the manner in which
parasites are regurgitated are the formation of a plug and Leishmania-induced damage to
the stomodeal valve. One proposed cause of regurgitation of parasites during blood-feeding
was based on observations of damage to the cuticle of the stomodeal valve in L.
major-infected P. papatasi, linked to a chitinase enzyme produced by the parasite [44].
The  stomodeal  valve  is  a  sphincter  muscle-controlled  ring  of  cuticular  epithelia  that 
separates  the  cardia  from  the  pharynx.  The  proposed  theory  is  that  damage  to  the 
stomodeal  valve  allows  a  flushing  and  regurgitation  of  the  contents  of  the  cardia  into  the 
pharynx,  through  the  proboscis,  and  then  into  the  bite  site  [44].  Additionally,  this  has 
been  repeatedly  observed  in  a  number  of  sand  fly  species  infected  with  different  species 
of  Leishmania  [44,  45].  Prior  to  this  observation,  it  was  believed  that  the  Leishmania 
parasites caused a blockage in the foregut, which caused regurgitation in a manner
analogous to the transmission of Yersinia pestis by fleas. This older theory was
reinforced  by  the  discovery  of  a  gel-like  plug  in  the  anterior  midgut  consisting  of 
metacyclic  promastigotes  and  promastigote  secretory  gel  (PSG),  a  substance  composed 
primarily of filamentous proteophosphoglycan [43]. Additionally, PSG was shown to
greatly enhance cutaneous leishmaniasis lesion size and persistence, an attribute shared
with molecules of sand fly saliva [25].

Sand  flies  that  become  infected  with  Leishmania  parasites  that  then  replicate  and 
are  successfully  transmitted  are  considered  competent  vectors.  Within  the  sand  fly 
midgut, there are abundant molecular interactions that are the determinants of
species-specific vector competence and include resisting digestive enzymes, escaping from the
PM  and  binding  to  the  midgut  epithelium.  The  earliest  obstacle  faced  by  the  recently 
ingested  parasite  is  the  attack  by  proteolytic  enzymes.  There  are  numerous  proteases, 
including  trypsins,  chymotrypsins,  amino-  and  carboxypeptidases,  within  the  midgut 
lumen  that  are  induced  by  blood-feeding,  facilitate  blood  meal  digestion  and  likely  confer 
some  immunity  to  ingested  organisms.  The  presence  of  Leishmania  parasites  in  the 
midgut  lumen  of  sand  flies  infected  using  promastigotes  changes  the  overall  protease 
activity,  inhibiting  or  delaying  proteolytic  enzymes  [46,  47].  Infections  initiated  using 
amastigotes,  a  more  natural  representation  of  the  infection  route,  caused  a  delay  in  trypsin 
and  aminopeptidase  activity  [48].  A  delay  in  protease  activity  would  allow  the  ingested 
amastigotes  to  transform  to  the  LPG-covered,  and  thus  more  protease  resistant, 
promastigote  form  of  the  parasite.  Until  recently,  it  has  been  unclear  which  specific 
proteolytic  molecules  are  up-  or  downregulated  by  the  presence  of  the  parasite  within  the 
sand  fly  midgut. 
Further protection is offered by the peritrophic matrix during the first few hours
after the blood meal is ingested. The addition of an exogenous chitinase, which
prevented PM formation, resulted in high parasite death despite the low abundance of
proteases immediately following the blood meal [28]. While the PM offers protection
early  in  the  establishment  of  the  parasite  within  the  midgut,  failure  of  the  Leishmania  to 
escape  the  PM  results  in  the  parasite  being  defecated  along  with  the  digested  blood  meal. 
In  uninfected  P.  papatasi,  the  posterior  area  of  the  PM  begins  to  break  down,  likely 
attributed  to  intrinsic  sand  fly  midgut  chitinase,  about  four  days  post-blood  meal  and  is 
excreted with the digested products. Phlebotomus papatasi infected with L. major shows
breakdown of the PM at 48 hours post-blood meal in the posterior area and additionally
in the anterior area, where there is a congregation of parasites [37]. Further evidence that
escape from the PM requires a Leishmania-specific competence factor was provided
by the non-natural infection of P. papatasi with L. panamensis, which led to the elimination
of the promastigotes enclosed in the PM when the sand fly excreted the digested blood
meal, demonstrating a species-specific developmental barrier [49]. With chitinase
genes  identified  and  the  molecules  characterized  in  Leishmania  and  the  sand  fly  midgut, 
determining  the  role  of  chitinolytic  enzymes  in  species-specific  vector  competence  would 
be  best  described  using  knockout  mutant  strains  of  parasites  and  RNA  knockdown  in 
sand flies. Additionally, it is reasonable to hypothesize that the Leishmania parasite may
influence  the  sand  fly  chitinase  molecule  and  peritrophins,  a  protein  component  of  the 
PM,  in  an  effort  to  bypass  this  physical  barrier  prior  to  binding  to  the  midgut  epithelium. 

Once  free  of  the  PM,  the  nectomonad  promastigotes  must  adhere  to  the  midgut 
epithelium to complete development within the sand fly host. The binding must occur
prior to defecation of the digested blood or the majority of parasites will be lost in the
excrement  and  the  sand  fly  may  not  become  infective.  It  has  been  well  described  that 
lipophosphoglycan  (LPG)  on  the  surface  of  the  Leishmania  is  responsible  for  the  binding 
of parasites to the midgut microvilli in certain Leishmania-sand fly pairings. It
was  first  documented  that  specific  oligosaccharides  on  the  phosphoglycan  of  L.  major 
procyclic  promastigotes  could  bind  the  midgut  of  P.  papatasi  [50].  A  similar  study 
demonstrated  stage-specific  binding  of  L.  donovani  in  P.  argentipes  and  used  purified 
LPG from procyclics to show inhibition of procyclic binding, whereas purified LPG
from  metacyclic  parasites  did  not  inhibit  procyclic  binding  to  the  midgut  [39]. 
Subsequently,  the  receptor  for  L.  major  in  P.  papatasi  was  identified  as  a  galectin, 
PpGalec,  binding  specifically  to  the  galactose  residues  of  LPG  [34].  Generation  of  an 
LPG-deficient mutant of L. major showed the importance of parasite binding, mediated
by LPG, in the development of a transmissible parasitemia in P. papatasi [51]. Recent
work using LPG-deficient L. major infections in the permissive vectors Lu. longipalpis
and P. arabicus demonstrated that LPG is not required for the development of heavy
promastigote infections in these sand flies [35]. Additional work generated the theory
that GalNAc-containing glycoproteins on the midgut epithelia are the ligands to which a
parasite lectin receptor binds, retaining the parasites and allowing full development of
several different Leishmania species in permissive sand flies [35].

Sand fly saliva

When the sand fly bites a host, a cocktail of pharmacologically active molecules
from the salivary glands is injected into the skin of the host, which facilitates
successful blood meal acquisition. The saliva cocktail contains molecules exhibiting
anti-platelet,  vasodilator,  anticoagulant,  anti-inflammatory  and  immunomodulatory 
activities  [52-57].  Researching  the  saliva  of  hematophagous  insects  has  improved  the 
understanding  of  the  evolution  of  blood-feeding,  and  more  importantly,  the  impact  of 
insect  saliva  on  pathogen  transmission  and  disease  progression.  During  the  transmission 
of  Leishmania  parasites  by  the  bite  of  a  sand  fly,  there  is  the  ubiquitous  co-inoculation  of 
infective  stage  metacyclic  promastigotes  and  saliva.  The  experiments  of  Titus  and 
Ribeiro  (1988),  which  demonstrated  the  enhancement  of  cutaneous  leishmaniasis  by  the 
co-injection of L. major with Lu. longipalpis salivary gland homogenate, served as a
stepping stone for further research on Leishmania susceptibility and resistance conferred
by sand fly saliva [58]. The exacerbation of leishmaniasis by saliva may be a
co-evolutionary mechanism that is essential for the propagation and maintenance of parasite
populations  in  the  wild. 

Inhabitants  of  endemic  areas  show  resilience  to  infection  in  comparison  to 
individuals  from  non-endemic  areas  who  acquire  leishmaniasis  when  entering  endemic 
areas. In contrast to its disease-enhancing effect, pre-exposure to sand fly saliva and inoculation
with specific salivary proteins confer protection against disease. This demonstrates the
potential  use  of  saliva-based  vaccines  against  leishmaniasis  [59-61].  The  outcome  of 
saliva-mediated  disease  exacerbation  or  protection  is  believed  to  be  a  product  of  immune 
modulation  by  specific  salivary  molecules.  Saliva-based  protection  is  characterized  by 
the  development  of  delayed-type  hypersensitivity  at  the  bite  site  with  increased 
interferon-γ and interleukin-12, T-helper type 1 cellular immune response indicators [61].
A high-throughput approach to identifying potential salivary-based vaccine candidates
was developed, implementing the sequencing of salivary gland transcriptomes,
proteomics  and  reverse  antigen  screening  [62]. 

Premise 

The  use  of  functional  transcriptomics  (the  exploitation  of  the  wealth  of  knowledge 
produced  by  the  massive  amounts  of  cDNA  sequencing  data  in  an  effort  to  formulate  and 
test  biologically  pertinent  hypotheses)  can  be  instrumental  in  the  identification  of  vaccine 
candidates  for  leishmaniasis  as  well  as  novel  insect  salivary  molecules  with  potential 
pharmaceutical applications. In Chapter 2, I demonstrate the application of functional
transcriptomics in the biochemical characterization of an adenosine deaminase (ADA)
molecule in Phlebotomus duboscqi saliva, identified by the surprising presence of ADA
sequences in a salivary gland transcriptome. Furthermore, the complex interactions
occurring  between  the  Leishmania  parasite  and  the  sand  fly  midgut  go  beyond  simple 
ligand  binding.  Just  as  amastigotes  invade  and  sabotage  the  microbicidal  activity  of  the 
host  macrophage,  there  are,  presumably,  interactions  between  vector  and  parasite  surface 
and  secreted  molecules  that  influence  Leishmania  colonization  and  maturation  within  the 
sand  fly.  I  propose  that  Leishmania  colonization  of  the  midgut  affects  the  transcript 
abundance  within  the  midgut  tissue  of  the  sand  fly  vector,  whether  due  to  subversion  of 
normal physiological conditions of the midgut tissue in the sand fly for the benefit of the
parasite  or  a  response  by  the  sand  fly  to  the  presence  of  Leishmania  colonizing  the 
alimentary  canal.  Functional  transcriptomic  methodologies  were  employed  in  the 
construction  and  comparative  analysis  of  P.  papatasi  and  Lu.  longipalpis  midgut  cDNA 
libraries,  as  described  in  Chapters  3  and  4.  Chapter  5  further  addresses  the  influence  of 
L. infantum chagasi on midgut transcript abundance in Lu. longipalpis and provides
insights  into  the  molecular  interactions  between  the  parasite  and  the  vector. 

Abstract

Two  transcripts  coding  for  an  adenosine  deaminase  (ADA)  were  identified  by 
sequencing  a  Phlebotomus  duboscqi  salivary  gland  cDNA  library.  Adenosine  deaminase 
was  previously  reported  in  the  saliva  of  the  sand  fly  Lutzomyia  longipalpis,  but  it  was  not 
present  in  the  saliva  of  the  sand  flies  Phlebotomus  papatasi,  P.  argentipes,  P.  perniciosus 
and  P.  ariasi,  suggesting  that  this  enzyme  is  only  present  in  the  saliva  of  sand  flies  from 
the  genus  Lutzomyia.  In  the  present  work,  we  tested  the  hypothesis  that  the  salivary 
gland  transcript  coding  for  ADA  in  Phlebotomus  duboscqi,  a  sister  species  of 
Phlebotomus  papatasi,  produces  an  active  salivary  ADA.  Salivary  gland  homogenates  of 
P.  duboscqi  converted  adenosine  to  inosine,  suggesting  the  presence  of  ADA  activity  in 
the  saliva  of  this  species  of  sand  fly;  furthermore,  this  enzymatic  activity  was 
significantly  reduced  when  using  either  salivary  glands  of  recently  blood-fed  sand  flies  or 
punctured  salivary  glands,  suggesting  that  this  enzyme  is  secreted  in  the  saliva  of  this 
insect.  This  enzymatic  activity  was  absent  from  the  saliva  of  P.  papatasi.  In  contrast  to 
other  Phlebotomus  sand  flies,  we  did  not  find  AMP  or  adenosine  in  P.  duboscqi  salivary 
glands  as  measured  by  HPLC-photodiode  array.  To  confirm  that  the  transcript  coding  for 
ADA  was  responsible  for  the  activity  observed  in  the  saliva  of  this  sand  fly,  we  cloned 
this  transcript  into  a  prokaryotic  expression  vector  and  produced  a  soluble  and  active 
recombinant  protein  of  approximately  60  kDa  that  was  able  to  convert  adenosine  to 
inosine.  Extracts  of  bacteria  transformed  with  control  plasmids  did  not  show  this 
activity.  These  results  suggest  that  P.  duboscqi  transcripts  coding  for  ADA  are 
responsible  for  the  activity  detected  in  the  salivary  glands  of  this  sand  fly  and  that  P. 
duboscqi  acquired  this  activity  independently  from  other  Phlebotomus  sand  flies.  This  is 
another example of a gene recruitment event in salivary genes of blood-feeding
arthropods  that  may  be  relevant  for  blood-feeding  and,  because  of  the  role  of  ADA  in 
immunity,  it  may  also  play  a  role  in  parasite  transmission. 

Background 

In  their  saliva,  blood-feeding  arthropods  have  potent  pharmacologically  active 
components  that  help  them  counteract  the  hemostatic  and  inflammatory  system  of  the 
vertebrate host each time they attempt to take a blood meal [1]. Vasodilators,
anticoagulants  and  inhibitors  of  platelet  aggregation  are  part  of  this  salivary  mixture  [2]. 
Recently,  with  the  technological  advances  in  DNA  and  protein  sequencing,  novel  and 
unexpected  molecules  with  potential  biological  activities  have  been  isolated  from  the 
saliva of blood-feeding arthropods. Such molecules include hyaluronidase,
nucleotidases,  novel  apyrases,  amine-binding  proteins,  tissue-factor  pathway  inhibitors 
and  others  [2].  Another  such  protein  is  adenosine  deaminase  (ADA),  which  was 
identified  from  transcripts  of  a  salivary  gland  cDNA  library  of  the  New  World  sand  fly 
Lutzomyia  longipalpis  and  the  mosquitoes  Culex  quinquefasciatus  and  Aedes  aegypti  [3, 
4]. This protein or the transcript coding for this protein has also been identified in other
organisms including bacteria, fruit flies, mice and humans [5]. Adenosine deaminase
(E.C. 3.5.4.4) catalyses the conversion of adenosine and 2'-deoxyadenosine to inosine
and 2'-deoxyinosine, respectively [6]. This enzyme is evolutionarily conserved and has a
(β/α)8 barrel structure and a zinc ion in the catalytic site [7]. Adenosine deaminase
deficiency  in  mice  results  in  the  impairment  of  T  and  B  cell  function  due  to  the 
accumulation  of  adenosine  resulting  in  severe  combined  immunodeficiency  (SCID)  [8]. 
The role of ADAs in insects, particularly in the saliva of blood-feeding insects, was
proposed  to  be  in  the  hydrolysis  of  adenosine,  a  molecule  involved  in  pain  perception  [5]. 
The  activity  of  this  enzyme  in  blood-feeding  insects  was  demonstrated  in  the  saliva  of  Lu. 
longipalpis, C. quinquefasciatus and Ae. aegypti from the activity of the recombinant
salivary  ADA  from  Lu.  longipalpis  [3-5].  Of  interest,  ADA  enzymatic  activity  or  the 
transcripts  coding  for  this  enzyme  were  not  present  in  the  salivary  gland  of  the  sand  flies 
P.  argentipes,  P.  papatasi,  P.  ariasi  and  P.  perniciosus,  which  belong  to  the  genus 
Phlebotomus  [3,9].  Instead,  the  saliva  of  P.  papatasi  and  P.  argentipes  contains  large 
amounts  of  adenosine  and  adenosine  monophosphate  (AMP)  [10].  Therefore,  it  appeared 
that  ADA  activity  was  only  present  in  the  saliva  of  Lutzomyia  sand  flies  and  not  in 
Phlebotomus  sand  flies.  Recently,  transcriptome  analysis  of  the  salivary  glands  of  the 
sand  fly  Phlebotomus  duboscqi,  a  sibling  species  of  P.  papatasi,  resulted  in  the 
identification  of  a  transcript  with  homologies  to  ADA  [11].  In  the  present  work,  we 
tested  whether  there  is  ADA  activity  in  P.  duboscqi  and  whether  the  identified  transcript 
codes  for  this  activity.  Because  P.  papatasi  does  not  have  ADA  activity,  but  has  large 
amounts  of  adenosine  and  AMP  in  the  saliva,  we  also  tested  for  the  presence  of  adenosine 
and  AMP  in  the  saliva  of  P.  duboscqi  sand  flies. 

Results 

By  sequencing  a  P.  duboscqi  salivary  gland  cDNA  library,  we  have  identified  two 
transcripts  coding  for  a  protein  homologous  to  ADA,  an  enzyme  that  metabolizes 
adenosine  to  inosine  [11].  The  first  transcript  (PduM73;  NCBI  accession  number 
DQ835357)  of  1846  bp  codes  for  a  secreted  protein  of  57.6  kDa  with  an  isoelectric  point 
of 5.5; the second transcript (PduM74) of 1810 bp codes for a protein of 57.2 kDa with
an  isoelectric  point  of  5.8  (Figure  4).  Multiple  sequence  comparison  of  P.  duboscqi  ADA 
with  homologues  from  Dipterans  such  as  the  sand  fly  Lu.  longipalpis  and  the  mosquitoes 
Ae. aegypti, Ae. albopictus and Culex pipiens and from mammals such as mice, rats and
humans  shows  an  overall  low  level  of  identity.  However,  the  amino  acids  forming  part  of 
the active site (His116, Hisns, Alam, Gly328, His355, Ghuss, Gly381, Asp440, Asp441) are highly
conserved  (Figure  5).  The  ADA  from  P.  duboscqi  and  from  other  insects  is  larger  than 
the  ADA  from  mice,  rats  or  humans;  a  large  string  of  approximately  80  amino  acids  at 
the  N-terminal  region  is  not  present  in  the  mammalian  ADA.  Additionally,  the  signal 
peptide  sequence  is  not  present  in  the  mammalian  ADA  (Figure  5).  Phylogenetic 
analysis  of  ADA  from  different  organisms  produced  a  tree  with  two  distinct  clades,  one 
containing  ADA  from  Dipteran  blood-feeders  and  the  other  clade  containing  other 
organisms  including  Leishmania,  Plasmodium,  Entamoeba,  mice,  rats  and  humans 
(Figure  6).  Within  the  Dipteran  blood-feeders  clade,  sand  flies  form  a  distinct  group 
separate  from  mosquitoes. 

Salivary  gland  ADA  activity 

Because  of  the  discovery  of  the  ADA  transcripts  in  the  salivary  gland  cDNA  library  of  P. 
duboscqi,  we  wanted  to  test  whether  the  saliva  of  this  sand  fly  had  ADA  activity.  For 
this,  SGH  of  P.  duboscqi  was  incubated  in  the  presence  of  adenosine  and  the  reaction  was 
followed  spectrophotometrically  by  scanning  from  220  nm  to  300  nm  every  3  min.  The 
substrate  adenosine  absorbs  at  265  nm,  and  the  product  of  ADA  activity,  inosine,  absorbs 
at 241 nm. The equivalent of 0.2 salivary gland pairs (0.2 µg) of P. duboscqi converted

Figure 4. Amino acid alignment of the two adenosine deaminase (ADA) molecules
derived from transcripts found in P. duboscqi salivary glands.

Amino acids shaded black are identical and those shaded gray are similar. The
secretory signal peptide is italicized.


Figure 5. Clustal alignments of invertebrate putative salivary adenosine deaminase
(ADA) of P. duboscqi (PduM73 and PduM74), Lutzomyia longipalpis, Aedes aegypti, Ae.
albopictus, Culex pipiens and mammalian ADA (mouse, rat and human).

Arrowheads  indicate  conserved  amino  acids  located  in  the  active  site  of  the 
mammalian  enzyme.  Black  shading  indicates  amino  acid  sequence  identity,  and  gray 
regions  indicate  conserved  amino  acid  substitutions. 

Figure 6. Phylogenetic tree analysis of putative adenosine deaminase (ADA).

Branch  lengths  are  proportional  to  genetic  distance  calculated  by  the  ClustalW 
program.  The  scale  bar  represents  0.5%  divergence. 

adenosine to inosine in 30 min (Figure 7A); by contrast, the same amount of SGH of P.
papatasi had no effect on adenosine (Figure 7B). The differential spectrum shows, in better
detail, the decrease of adenosine (265 nm) and, over time, the increase of inosine
(241 nm) in the presence of P. duboscqi SGH (Figure 7C), indicating the presence of
ADA  activity  in  the  salivary  gland  of  this  sand  fly.  To  test  whether  this  activity  is 
secreted  in  the  saliva  of  P.  duboscqi,  we  compared  ADA  activity  from  SGH  of  unfed 
sand  flies  (intact  saliva),  from  SGH  of  recently  blood-fed  sand  flies  (loss  of  secreted 
protein  by  salivation  during  feeding)  and  from  punctured  salivary  glands  (loss  of  all  or 
the  majority  of  the  salivary  contents  and  therefore  any  enzymatic  activity).  Salivary 
glands  from  unfed  sand  flies  had  the  highest  ADA  activity  while  preparations  from  the 
salivary  glands  of  recently  blood-fed  sand  flies  had  approximately  70%  less  activity 
(Figure  8).  Finally,  ADA  activity  was  not  detected  in  the  preparations  of  punctured 
salivary  glands  (Figure  8).  Additionally,  the  amino-terminal  sequence  of  the  native 
protein  was  detected  in  the  secreted  fraction  of  the  SGH  of  this  sand  fly  [11].  These  data 
suggest that the molecule responsible for this activity is secreted in the saliva of this sand
fly.

Lack  of  adenosine  and  AMP  in  the  saliva  of  P.  duboscqi 

It was previously shown that neither transcripts coding for ADA nor ADA activity could
be detected in the salivary glands of P. papatasi and P. argentipes; however, these sand
flies have large amounts of adenosine and AMP within their salivary glands [3,9,12].
Although counterintuitive, due


Figure  7.  ADA  activity  of  salivary  homogenates  of  P.  duboscqi. 

A cuvette containing 20 µM adenosine in PBS was scanned at 3 min intervals for
30 min following addition of salivary homogenate equivalent to 0.2 pairs of salivary
gland from P. duboscqi (A) and P. papatasi (B). (C) Differential spectra of the data in
(A) were obtained by subtracting each scan from the scan at time zero. The arrows
indicate  the  direction  of  change  of  the  spectrum  over  time. 

Figure 8. Salivary adenosine deaminase (ADA) activity from salivary gland homogenate
(SGH) of unfed and blood-fed sand flies and from punctured salivary glands.

A cuvette containing 20 µM adenosine in PBS was scanned at 1.5 min intervals
for 15 min following addition of salivary homogenate equivalent to 0.2 pairs of salivary
gland. Legend entries: unfed (N=8), blood-fed (N=7).


to  the  presence  of  ADA  activity,  P.  duboscqi  salivary  glands  were  tested  for  the  presence 
of  adenosine  and  AMP  by  subjecting  P.  duboscqi  SGH  to  molecular  sieving-HPLC  (MS- 
HPLC),  and  the  eluted  products  were  detected  by  photodiode  array  detection.  As 
expected,  and  as  previously  shown,  analysis  of  P.  papatasi  SGH  resulted  in  the  presence 
of  two  major  peaks  with  the  same  retention  times  as  adenosine  (18.5  min)  and  AMP 
(14.5  min),  respectively  (Figure  9).  By  contrast,  P.  duboscqi  SGH  showed  no  peaks  at 
the  retention  times  of  adenosine  and  AMP  (Figure  9).  Only  a  peak  at  5  min  was 
observed, which corresponds to the secreted proteins from the salivary glands, as determined by the
retention  time  and  the  absorption  spectra  at  280  nm  (Figure  9).  These  data  suggest  that, 
in  contrast  to  P.  papatasi  and  P.  argentipes,  P.  duboscqi  does  not  have  AMP  or 
adenosine  in  its  salivary  glands  and  that  it  contains  the  active  salivary  ADA. 

Expression  and  activity  of  recombinant  P.  duboscqi  salivary  ADA 

In  order  to  determine  if  the  ADA  activity  detected  in  P.  duboscqi  SGH  was  related  to  the 
transcript  coding  for  this  enzyme,  we  cloned  the  two  transcripts  coding  for  this  protein 
into the pCRT7/NT-TOPO bacterial expression vector. The soluble expressed proteins
were  purified  from  the  supernatant  of  bacterial  lysate  by  nickel  magnetic  beads,  and  an 
aliquot  was  subjected  to  western  blot  analysis  and  detected  using  anti-histidine  antibody. 
This  revealed  a  protein  of  approximately  60  kDa,  which  is  the  estimated  molecular  mass 
of the predicted ADA including the 4-kDa N-terminal addition that includes His6G and
Xpress™ peptide epitopes (Figure 10A, lanes 1 and 2). No protein of this molecular mass
was detected in the supernatant of bacteria expressing the empty vector (Figure 10A, lane
3).  Furthermore,  a  soluble  expressed  protein  with  the  same  migration  pattern  was 

Figure 9. Three-dimensional chromatographic display of photodiode array data obtained
from MS-HPLC of P. papatasi and P. duboscqi salivary gland homogenate (SGH).

The three-dimensional data show retention time (min) on the x-axis, UV absorbance on
the y-axis (in milli-absorbance units) and the UV absorbance spectra on the z-axis (from
200 nm to 320 nm). Adenosine and AMP standards show retention times of 18.5 and
14.5 min, respectively.

detected by Coomassie blue staining (Figure 10B, lanes 1 and 2), and this protein was not
detected  in  samples  from  bacteria  transformed  with  control  plasmid  (Figure  10B,  lane  3). 
The  purified  soluble  recombinant  proteins  were  then  tested  for  the  presence  of  ADA 
activity.  Both  sand  fly  recombinant  proteins  had  a  high  level  of  ADA  activity  as  detected 
spectrophotometrically by the conversion of adenosine to inosine (Figure 11A and 11B).
This activity was not detected in the supernatant of bacteria transformed with the control
plasmid (Figure 11C). These data suggest that the P. duboscqi transcript coding for an
ADA  is  responsible  for  the  ADA  activity  detected  in  the  saliva  of  this  sand  fly. 

Discussion 

Phlebotomus  duboscqi  is  a  proven  vector  of  Leishmania  major  in  sub-Saharan  Africa.  It 
belongs  to  the  subgenus  Phlebotomus,  together  with  the  sand  fly  vector  P.  papatasi. 
Knowledge  of  the  repertoire  of  salivary  activities  or  molecules  from  the  saliva  of  P. 
duboscqi  is  very  limited.  We  recently  sequenced  a  large  number  of  transcripts  from  the 
salivary  gland  of  P.  duboscqi  and  identified  a  transcript  coding  for  the  enzyme  ADA  [11]. 
This  is  the  first  report  of  ADA  in  a  sand  fly  from  the  genus  Phlebotomus,  including  data 
from  transcriptome  analysis  from  the  salivary  glands  of  P.  papatasi,  P.  ariasi,  P. 
argentipes and P. perniciosus sand flies [9]. In the present work, we have demonstrated
the  presence  of  ADA  activity  in  the  saliva  of  P.  duboscqi,  and  we  also  demonstrated  that 
the  soluble  recombinant  protein  produced  from  the  transcript  coding  for  this  enzyme 
exhibited  ADA  activity.  Phylogenetic  analysis  placed  P.  duboscqi  ADA  in  the  same 
clade  with  ADA  from  other  blood-feeding  arthropods.  This  group  belongs  to  the 
ADGF/CECR1 family of proteins identified previously in Sarcophaga peregrina,

Figure 10. Expression of recombinant P. duboscqi salivary adenosine deaminase
(ADA). 

cDNAs  encoding  P.  duboscqi  salivary  ADA  (PduM73  and  PduM74)  were  cloned 
into pCRT7/NT-TOPO vector, and recombinant proteins were expressed in E. coli. The
affinity column-purified proteins (lane 1, PduM73; lane 2, PduM74; lane 3, empty
plasmid  vector)  were  analyzed  by  SDS-PAGE  and  then  subjected  to  (A)  western  blotting 
and  (B)  Coomassie  blue  staining. 

Figure 11. Enzymatic activity of recombinant P. duboscqi salivary adenosine
deaminase  (ADA). 

A cuvette containing 20 µM adenosine in PBS was scanned at 1.5 min intervals
for  15  min  following  addition  of  recombinant  ADA  (A,  PduM73;  B,  PduM74)  or 
supernatant  of  bacteria  transformed  with  control  plasmid  (C). 

Lutzomyia longipalpis, Drosophila, Aplysia and humans [13]. This sub-family of ADAs
has an extended N-terminal region and is targeted for secretion. These data suggest that
P.  duboscqi  acquired  this  activity  independent  of  other  Phlebotomus  sand  flies.  The 
question  remains  as  to  what  the  role  of  this  protein  is  in  blood-feeding.  It  was  previously 
speculated  that  the  activity  may  be  related  to  the  hydrolysis  of  adenosine,  an  important 
component  in  pain  perception  and  in  immunity  [3].  What  is  puzzling  is  that  other 
Phlebotomus  sand  flies  do  not  have  ADA  in  their  salivary  glands,  but  they  have  large 
amounts  of  adenosine  and  AMP,  very  active  vasodilators  and  platelet  inhibitors.  Neither 
adenosine  nor  AMP  was  present  in  the  saliva  of  P.  duboscqi,  as  demonstrated  in  this 
chapter.  Lutzomyia  longipalpis  also  lacks  adenosine  and  AMP  in  its  saliva;  however,  it 
has  maxadilan,  a  very  potent  vasodilator  [14].  Maxadilan  was  not  identified  in  the  P. 
duboscqi  cDNA  library  [11].  Therefore,  it  appears  that  P.  duboscqi  may  contain  a  novel 
vasodilator  that  will  replace  the  lack  of  vasodilatory  activities  exerted  by  AMP  and 
adenosine  in  other  Phlebotomus  sand  flies.  The  fact  that  ADA  is  present  only  in  P. 
duboscqi  and  not  in  other  Phlebotomus  sand  flies  examined  to  date  emphasizes  the  ability 
of  blood-feeding  arthropods  to  acquire  independent  strategies  to  overcome  or  modulate 
the  host  hemostatic,  inflammatory  and  immune  system.  ADA  has  an  important  role  in 
immunity  as  a  result  of  the  effects  of  adenosine,  2’-deoxyadenosine  and  the  hydrolytic 
product of these compounds [6]. Further work will be necessary to determine the effect
of  this  enzyme  in  parasite  transmission.  It  may  be  possible  that  this  enzyme  changes  the 
environment  in  the  skin  where  the  Leishmania  parasite  is  deposited  by  the  sand  fly. 
Inosine  is  the  primary  metabolite  of  adenosine  by  ADA.  Inosine  has  been  shown  to 
inhibit the production of proinflammatory cytokines including TNF-α, IL-1, IL-12,
MIP-1α and IFN-γ in stimulated macrophages and spleen cells [15]. Additionally,
adenosine  and  inosine  can  alter  cutaneous  vasopermeability  by  activating  A3  receptors  on 
mast cells [16]. It is therefore possible that inosine may favor a Th2 environment that
will  benefit  parasite  establishment  in  the  skin  of  the  mammalian  host. 

Materials  and  methods 

Sand  flies  and  preparation  of  salivary  gland  homogenate  (SGH) 

Phlebotomus duboscqi Theodor, Mali strain, were reared using a mixture of
fermented  rabbit  food  and  rabbit  feces  as  larval  food.  Adult  sand  flies  were  offered  a 
cotton swab containing 20% sucrose and were used for dissection of salivary glands at 5-7
days after emergence. Salivary glands were stored in groups of 10 pairs in 10 µl
phosphate-buffered  saline  (PBS).  Salivary  glands  were  disrupted  by  ultrasonication  in 
1.5  ml  conical  tubes.  Tubes  were  centrifuged  at  10,000  g  for  2  min  and  the  resultant 
supernatant  used  for  the  studies. 

Salivary  gland  cDNA  library 

The  P.  duboscqi  salivary  gland  cDNA  library  was  made  as  previously  described 
[11].  Briefly,  mRNA  was  isolated  from  55  salivary  gland  pairs  using  the  Micro- 
FastTrack  mRNA  isolation  kit  (Invitrogen,  San  Diego,  CA,  USA).  The  PCR-based 
cDNA  library  was  made  following  the  instructions  for  the  SMART  cDNA  library 
construction  kit  (BD-Clontech,  Palo  Alto,  CA,  USA)  with  some  modifications  [11].  The 
P.  duboscqi  cDNA  library  was  sequenced  as  previously  described  using  an  Applied 
Biosystems 3730xl DNA Analyzer (Foster City, CA, USA) and a CEQ 2000XL DNA
sequencing  instrument  (Beckman  Coulter,  Fullerton,  CA,  USA)  [11]. 

Phylogenetic  analysis 

Consensus  protein  sequences  were  compared  to  related  sequences  from  sand  flies 
as  well  as  non-sand  fly  species  obtained  from  GenBank.  Sequences  were  aligned  using 
ClustalX  and  manually  refined  using  BioEdit  sequence  editing  software 
(http://www.mbio.ncsu.edu/BioEdit/page2.html)  [17].  Phylogenetic  analysis  was 
conducted  on  protein  alignments  using  Tree  Puzzle  version  5.2  [18].  Tree  Puzzle 
constructs  phylogenetic  trees  by  maximum  likelihood,  using  quartet  puzzling, 
automatically  estimating  internal  branch  node  support  (1000  replications).  Derived  trees 
were  visualized  using  MEGA  (Molecular  Evolutionary  Genetics  Analysis)  version  3.1 
(http://www.megasoftware.net/)  [19]. 
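
As an illustration of the kind of pairwise comparison that underlies statements about sequence identity, the following Python sketch computes percent identity between two already-aligned sequences. It is a simplified stand-in and not the ClustalX/Tree Puzzle workflow described above; the function name `percent_identity` and the two short aligned fragments are hypothetical and are not ADA sequence data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Made-up aligned fragments for illustration only (not real ADA sequence).
seq_1 = "MKT-LHVHLDG"
seq_2 = "MRTALHIHL-G"
print(f"{percent_identity(seq_1, seq_2):.1f}% identity")
```

Columns containing a gap in either sequence are skipped here; other conventions (for example, dividing by the full alignment length) give different values, so the convention should be stated whenever identity is reported.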

Enzymatic  assays 

Measurement of activity was performed in quartz microcuvettes using 60 µl
samples (Starna Cells, Atascadero, CA, USA). 20 µM adenosine in PBS was added to
the  cuvette,  followed  by  addition  of  the  enzyme  source.  After  mixing  the  solution  by 
pipetting,  the  absorbance  between  220  and  300  nm  was  monitored  at  1.5  or  3.0  min 
intervals  using  a  Lambda  18  spectrophotometer  from  Perkin  Elmer  (Norwalk,  CT,  USA). 
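
The differential spectra used to follow the reaction can be sketched as follows. This minimal Python example assumes each scan is stored as a mapping from wavelength (nm) to absorbance and subtracts the time-zero scan from every scan, so that negative values mark a loss of absorbance. The sign convention, the function name and the absorbance values are illustrative assumptions, not measurements from this work.

```python
from typing import Dict, List

Spectrum = Dict[int, float]  # wavelength (nm) -> absorbance


def differential_spectra(scans: List[Spectrum]) -> List[Spectrum]:
    """Return each scan minus the time-zero scan, wavelength by wavelength."""
    baseline = scans[0]
    return [
        {wl: scan[wl] - baseline[wl] for wl in baseline}
        for scan in scans
    ]


# Hypothetical absorbance readings at the inosine (241 nm) and
# adenosine (265 nm) wavelengths named in the text.
scans = [
    {241: 0.10, 265: 0.30},  # t = 0 min
    {241: 0.14, 265: 0.24},  # t = 3 min
    {241: 0.18, 265: 0.18},  # t = 6 min
]
diffs = differential_spectra(scans)
for minute, diff in zip((0, 3, 6), diffs):
    print(minute, round(diff[241], 3), round(diff[265], 3))
```

With this convention, conversion of adenosine to inosine appears as an increasingly negative difference at 265 nm and an increasingly positive difference at 241 nm.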

Molecular sieving-high-performance liquid chromatography

Molecular  sieving-high-performance  liquid  chromatography  (MS-HPLC)  was 
carried out using a Dionex Summit system and Chromeleon software (Dionex,
Sunnyvale, CA, USA). For analysis, 20 µl of sample was applied to a Superdex Peptide
PC 3.2/30 column (Amersham Biosciences, Piscataway, NJ, USA) using 10 mM NaPO4,
150 mM NaCl, pH 6.5 as the mobile phase at a flow rate of 150 µl/min for 30 min of
separation.  Detection  was  performed  using  a  photodiode  array  detector  and  a  3-D 
chromatogram  generated  using  Chromeleon  software.  Adenosine  and  AMP  standards 
(500  pmol)  were  applied  separately.  Single  pairs  of  salivary  glands  from  P.  papatasi  and 
P.  duboscqi  were  sonicated  in  PBS,  clarified  by  centrifugation  and  applied  to  HPLC. 
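
Assigning peaks to the standards by retention time can be sketched as below. The retention times of adenosine (18.5 min) and AMP (14.5 min) are those reported in this chapter; the matching tolerance of 0.3 min, the function name and the list of observed peaks are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Optional

# Retention times of the standards as reported in the text (min).
STANDARDS: Dict[str, float] = {"adenosine": 18.5, "AMP": 14.5}


def assign_peak(retention_min: float, tolerance_min: float = 0.3) -> Optional[str]:
    """Return the nearest standard if it lies within the tolerance, else None."""
    name, reference = min(
        STANDARDS.items(), key=lambda item: abs(item[1] - retention_min)
    )
    return name if abs(reference - retention_min) <= tolerance_min else None


# Hypothetical observed peak retention times (min).
observed: List[float] = [5.0, 14.6, 18.4]
labels = [assign_peak(rt) for rt in observed]
print(list(zip(observed, labels)))
```

Retention time alone is a weak identifier, which is why the photodiode array spectrum of each peak is also compared with that of the standard.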

Expression  of  P.  duboscqi  ADA 

DNA  fragments  encoding  mature  P.  duboscqi  ADA  protein  were  amplified  and 
inserted  into  the  cloning  site  of  the  pCRT7/NT-TOPO  vector  (Invitrogen).  The  primers 
used  for  PCR  amplification  of  the  mature  ADA  encoding  fragments  were 
PDBLP02G04VF (5'-GTTTTGGACATTTCGAACATTA-3') and
PDBLP02G09VR (5'-TGGCTCCAAATGATTCAGACA-3') for 2G4 (PduM73) and
PDBLP02G09VF (5'-CTTTGAAAATTAAACCGAAACGA-3') and
PDBLP02G09VR for 2G9 (PduM74). Escherichia coli strain BL21(DE3)pLysS cells
(Invitrogen)  were  transformed  with  the  recombinant  plasmid  and  grown  in  LB  broth 
containing ampicillin (50 µg/ml). Production of recombinant protein was induced by
addition of IPTG to a final concentration of 1 mM and incubation at 27°C for 3 h. The recombinant
protein  was  purified  from  the  supernatant  of  bacterial  sonic  lysate  using  a  MagneHis 
Protein Purification System (Promega, Madison, WI, USA) and dialysed with Centricon
Plus-20  (Millipore,  Bedford,  MA,  USA)  to  remove  imidazole  from  the  elution  buffer 
before  further  enzymatic  analysis. 
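
Basic properties of the primers listed above can be checked with a short script. The sketch below computes the reverse complement, GC fraction and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only an approximation intended for short oligonucleotides. The helper names are hypothetical, and the sequences are transcribed from the text and should be verified against the original records before use.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as given in the text (5' to 3').
primers = {
    "PDBLP02G04VF": "GTTTTGGACATTTCGAACATTA",
    "PDBLP02G09VF": "CTTTGAAAATTAAACCGAAACGA",
    "PDBLP02G09VR": "TGGCTCCAAATGATTCAGACA",
}
for name, seq in primers.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", f"Tm~{wallace_tm(seq)}")
```

For primers longer than about 20 nucleotides, nearest-neighbor thermodynamic models generally give more reliable melting temperatures than the Wallace rule.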

Sodium  dodecyl  sulfate  (SDS)-polyacrylamide  gel  electrophoresis  (PAGE)  and 
western  blotting 

The  samples  were  treated  with  NuPAGE  LDS  sample  buffer  (Invitrogen)  and 
analyzed  on  NuPAGE  10%  Bis-Tris  gels  (Invitrogen)  with  NuPAGE  MES  SDS  running 
buffer  (Invitrogen).  To  estimate  the  molecular  mass  of  the  samples,  SeeBlue  markers 
from  Invitrogen  (myosin,  bovine  serum  albumin,  glutamic  dehydrogenase,  alcohol 
dehydrogenase,  carbonic  anhydrase,  myoglobin,  lysozyme,  aprotinin,  and  insulin,  chain 
B)  were  used.  After  electrophoresis,  the  gels  were  stained  with  SimplyBlue™  SafeStain 
Coomassie® (Invitrogen) or SilverQuest™ Silver Staining (Invitrogen). For the western
blotting,  the  proteins  in  the  gel  were  transferred  to  nitrocellulose  membrane  (Invitrogen) 
using  NuPAGE  transfer  buffer  (Invitrogen).  After  blocking  with  5%  milk  in  Tris- 
buffered  saline  containing  0.1%  Tween-20,  pH  8.0  (TBST),  the  membrane  was  incubated 
with  alkaline  phosphatase  (AP)-conjugated  anti-His6/G  antibody  (Invitrogen)  for  1  h  at 
room  temperature.  After  three  washes  with  TBST,  the  blots  were  developed  by  addition 
of 5-bromo-4-chloro-3-indolyl-1-phosphate and nitro blue tetrazolium for visualization.

Abstract

Background:  In  sand  flies,  the  blood  meal  is  responsible  for  the  induction  of 
several  physiologic  processes  that  culminate  in  egg  development  and  maturation.  During 
blood-feeding,  infected  sand  flies  also  are  able  to  transmit  the  parasite  Leishmania  to  a 
suitable  host.  Many  blood-induced  molecules  play  significant  roles  during  Leishmania 
development  in  the  sand  fly  midgut,  including  parasite  killing  within  the  endoperitrophic 
space.  In  this  work,  we  randomly  sequenced  transcripts  from  three  distinct  high  quality 
full-length  female  Phlebotomus  papatasi  midgut-specific  cDNA  libraries  from  sugar-fed, 
blood-fed and Leishmania major-infected sand flies. Furthermore, we compared the
transcript  expression  profiles  from  the  three  different  cDNA  libraries  by  customized 
bioinformatics analysis and validated these findings by semi-quantitative PCR and
real-time PCR.

Results:  Transcriptome  analysis  of  4010  cDNA  clones  resulted  in  the 
identification  of  the  most  abundant  P.  papatasi  midgut-specific  transcripts.  The 
identified  molecules  included  those  with  putative  roles  in  digestion  and  peritrophic 
matrix  formation,  among  others.  Moreover,  we  identified  sand  fly  midgut  transcripts  that 
are expressed only after a blood meal, such as microvilli associated-like proteins
(PpMVP1, PpMVP2 and PpMVP3), a peritrophin (PpPer1), trypsin 4 (PpTryp4),
chymotrypsin PpChym2, and two unknown proteins. Of interest, many of these
overabundant transcripts such as PpChym2, PpMVP1, PpMVP2, PpPer1 and PpPer2
were of lower abundance when the sand fly was given a blood meal in the presence of L.
major.

Conclusion: This tissue-specific transcriptome analysis provides a comprehensive
look  at  the  repertoire  of  transcripts  present  in  the  midgut  of  the  sand  fly  P.  papatasi. 
Furthermore,  the  customized  bioinformatic  analysis  allowed  us  to  compare  and  identify 
the overall transcript abundance from sugar-fed, blood-fed and Leishmania-infected sand
flies. The suggested upregulation of specific transcripts in a blood-fed cDNA library
was validated by real-time PCR, suggesting that this customized bioinformatic analysis
is  a  powerful  and  accurate  tool  useful  in  analyzing  expression  profiles  from  different 
cDNA  libraries.  Additionally,  the  findings  presented  in  this  work  suggest  that  the 
Leishmania  parasite  is  modulating  key  enzymes  or  proteins  in  the  gut  of  the  sand  fly  that 
may  be  beneficial  for  its  establishment  and  survival. 

Background 

Cutaneous  leishmaniasis  due  to  L.  major  is  found  throughout  the  Old  World, 
including  the  Middle  East  and  West  Africa.  Phlebotomus  papatasi  is  the  principal  vector 
for  this  parasite  and  is  refractory  to  the  development  of  other  species  of  Leishmania. 

Upon  taking  a  blood  meal,  hematophagous  arthropods  express  a  large  number  of 
molecules  that  participate  in  various  physiologic  processes  ranging  from  blood  digestion 
to  egg  development.  Furthermore,  many  insects  can  either  obtain  or  transmit  pathogens 
during  the  acquisition  of  a  blood  meal.  In  blood-feeding  arthropods,  the  midgut  plays  a 
crucial  role  as  the  primary  organ  involved  in  processing  the  blood  meal  and,  in  some 
instances,  molecules  expressed  in  the  midgut  of  an  insect  vector  have  been  shown  to 
directly  influence  pathogen  establishment  [1,2].  Certain  pathogens,  such  as  Leishmania, 
appear able to modulate the activity of sand fly midgut proteases for their own benefit or
survival  [3,  4]. 

Sequenced data sets containing information regarding expression profiles of
anopheline and culicine mosquitoes, such as Anopheles gambiae and Aedes aegypti,
following a blood meal have become available [5,6]. Other datasets now encompass
insects such as Pediculus humanus [7] and Culicoides sonorensis [8]. In comparison,
transcriptome information regarding sand flies is limited. Previous work has focused
mainly on the sand fly salivary gland [9-11], whereas only a small number of sand
fly-specific midgut cDNAs have been identified [12-16]. Recently, a large set of cDNA
transcripts from the whole sand fly Lutzomyia longipalpis has been sequenced, providing
greater information regarding molecules present in sand flies [17]. However, the
information regarding sand fly midgut-specific transcripts remains poor.

In this work, we embarked on a comprehensive study of P. papatasi midgut-specific
transcripts and directly compared their expression profiles among midguts of females
fed on sugar only, on blood, or on blood containing L. major. With this approach, we
have identified several P. papatasi midgut-specific transcripts that are differentially
expressed after a blood meal and in the presence of L. major.


Results  and  discussion 


The midgut is the tissue in which Leishmania develops within its sand fly vector.
Within the midgut environment, Leishmania possibly interacts with
various secreted molecules and cell types lining the midgut epithelia. In order to gain
greater insight into the repertoire of the proteins present in the midgut of P. papatasi, we
constructed and sequenced three high quality full-length cDNA libraries from the midgut
of sand flies fed either on sugar only (unfed), blood or blood containing L. major. A total
of  4010  high  quality  sequenced  clones  obtained  from  the  three  cDNA  libraries  were 
combined  and  analyzed,  resulting  in  the  formation  of  1382  clusters.  Each  cluster  may 
contain  a  large  number  of  transcripts,  which  creates  a  contig  (high  quality  consensus 
sequence)  or  may  have  a  single  transcript  that  can  be  defined  as  a  singleton.  Therefore, 
we  will  utilize  the  nomenclature  of  “cluster”  in  the  remainder  of  the  manuscript  to  define 
either  a  consensus  sequence  from  various  transcripts  or  a  singleton. 
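
The distinction between a contig and a singleton can be illustrated with a minimal sketch. It assumes that the assembly step has already assigned every clone to a cluster; the clone names, the cluster numbers, and the function name classify_clusters are hypothetical and are not taken from the pipeline used in this work.

```python
from collections import Counter


def classify_clusters(assignments):
    """Label each cluster as a 'contig' (more than one clone) or a 'singleton'.

    assignments: dict mapping a clone identifier to its cluster number
    (hypothetical input; the real assembly output may look different).
    """
    sizes = Counter(assignments.values())
    return {
        cluster: ("contig" if n_clones > 1 else "singleton")
        for cluster, n_clones in sizes.items()
    }


# Hypothetical example: two clones assemble together, one stands alone.
example = {"cloneA": 1, "cloneB": 1, "cloneC": 2}
print(classify_clusters(example))
```

Either kind of entry is referred to as a "cluster" in the text, so downstream counts treat both in the same way.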

Consensus  sequences  were  compared  with  various  databases,  and  putative 
functions  were  assigned.  The  categories  for  the  transcripts’  potential  biologic  functions 
included  protein  synthesis  machinery,  protein  modification  machinery,  transcription 
machinery,  transporters,  extracellular  matrix,  signal  transduction,  immunity,  adhesion, 
and conserved proteins of unknown function. Table 3 summarizes this analysis, listing
transcripts from female P. papatasi midguts fed on sugar, on blood, and on blood
containing L. major. The first column shows the putative biological function. The first
section of columns shows the number of clusters found in each of the three cDNA
libraries for that function, the second section indicates the total
number of sequences for these clusters, and the third section shows the
average number of sequences per cluster. The category of “conserved unknown
function” had the largest number of clusters in all three cDNA libraries. It
was followed by energy metabolism in the sugar-fed library (95 clusters); amino acid
metabolism, which includes digestive enzymes, in the blood meal library (40 clusters);
and protein synthesis machinery in the L. major blood-meal library (51 clusters). The
category with the highest number of sequences per cluster differed among the three
cDNA libraries: extracellular matrix transcripts
(27.33 seq/cluster) in the sugar-fed cDNA library, and cytoskeletal transcripts in both the
blood meal (19.40 seq/cluster) and L. major blood meal (15.00 seq/cluster) cDNA
libraries. The sugar-fed cDNA library has 669 clusters with an average of 3.23
sequences per cluster. The cDNA library constructed from blood-fed midguts consisted
of 441 clusters, with an average of 3.27 sequences per cluster. The library constructed
from P. papatasi midguts fed on blood containing L. major produced 555 clusters, with
an average of 3.01 sequences per cluster.
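
The three sections of columns described for Table 3 (clusters, sequences, and sequences per cluster) can be derived from a per-clone listing. The following is a minimal sketch under that assumption; the record layout (cluster, category, library) and the function name summarize_by_category are hypothetical, and the toy records are not data from this study.

```python
from collections import defaultdict


def summarize_by_category(records):
    """Tally clusters, sequences and mean sequences per cluster.

    records: iterable of (cluster_id, category, library) tuples, one tuple
    per sequenced clone (hypothetical layout).
    Returns {(category, library): (n_clusters, n_sequences, mean_per_cluster)}.
    """
    n_sequences = defaultdict(int)
    clusters_seen = defaultdict(set)
    for cluster_id, category, library in records:
        key = (category, library)
        n_sequences[key] += 1
        clusters_seen[key].add(cluster_id)
    return {
        key: (
            len(clusters_seen[key]),
            n_sequences[key],
            round(n_sequences[key] / len(clusters_seen[key]), 2),
        )
        for key in n_sequences
    }


# Toy records for illustration only.
toy = [
    (1, "cytoskeletal", "blood-fed"),
    (1, "cytoskeletal", "blood-fed"),
    (2, "cytoskeletal", "blood-fed"),
    (3, "immunity", "sugar-fed"),
]
for key, row in sorted(summarize_by_category(toy).items()):
    print(key, row)
```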

The number of sequences in each category for the three cDNA libraries is
graphically represented in Figure 12. After blood feeding, there is a decrease in the
number of sequences in all categories other than cytoskeletal, amino acid metabolism,
and heme metabolism. Noticeable differences in the number of sequences between the
blood-fed library and the library from flies fed on blood containing L. major occur in
the protein synthesis machinery, extracellular matrix, cytoskeletal, heme metabolism,
and conserved of unknown function categories.
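
One common way to judge whether a count differs between two libraries of different size is a 2x2 chi-square test. The sketch below is only an illustration of that idea: the text above does not state which statistic the customized analysis uses, so the choice of test, the function name chi_square_2x2, and the example counts are all assumptions.

```python
import math


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    count_a, count_b: sequences of one category (or cluster) in libraries A and B.
    total_a, total_b: total sequences in libraries A and B.
    Returns (chi2, p_value).
    """
    a, b = count_a, total_a - count_a
    c, d = count_b, total_b - count_b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0, 1.0  # a margin is empty, so no comparison is possible
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 50 of 100 sequences in one library versus 5 of 100 in another.
chi2, p = chi_square_2x2(50, 100, 5, 100)
print(round(chi2, 2), p)
```

Normalizing by library size in this way matters because the three libraries contain different numbers of sequences.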

Table  4  gives  a  more  detailed  description  of  the  different  types  of  transcripts 
identified  in  the  combined  analysis  of  the  three  cDNA  libraries.  Only  high  quality 
sequences  and,  for  the  most  part,  full-length  coding  sequences  submitted  to  GenBank  are 
shown.  This  table  shows  the  different  clusters  arranged  in  the  order  of  cluster  number  in 
the  combined  analysis  of  the  three  cDNA  libraries.  The  first  column  of  Table  4  describes 
the  cluster  number,  the  second  column  shows  the  clone  that  produced  the  full-length 


[Figure 12 image not reproduced. Category axis: putative biological function (protein synthesis machinery; protein modification machinery; protein export machinery; transcription machinery; transcription factors; proteasome machinery; transporters; extracellular matrix; cytoskeletal; signal transduction; protease inhibitor; immunity; adhesion; nuclear metabolism and regulation; metabolism, energy; metabolism, lipid; metabolism, carbohydrate; metabolism, amino acid; metabolism, nucleic acid and nucleotides; metabolism, heme; conserved of unknown function).]

Figure 12. Distribution of sequences analyzed from each cDNA library separated by
putative biologic function.

Table 4: Clusters of combined P. papatasi midgut cDNA libraries (sugar-fed, blood-fed
and Leishmania major-infected) of transcripts with high quality sequences

Cluster | Clone | NCBI best match to NR database | E value | Putative function | GenBank
Cluster 1 | B05_ppmgbl_p23 | microvilli membrane protein [A. aegypti] | 1.0E-47 | Microvilli protein | EU031911
Cluster 2 | F05_ppmgbl_p23 | microvilli membrane protein [A. aegypti] | 1.0E-47 | Microvilli protein | EU031911
Cluster 3 | PPMGBM189 | microvilli membrane protein [A. aegypti] | 7.0E-48 | Microvilli protein | EU031911
Cluster 9 | PPMGBL17 | LP07759p [D. melanogaster] | 8.0E-25 | Peritrophin | EU031912
Cluster 10 | A05PPMGBSP28 | hypothetical protein 17 [L. obliqua] | 3.0E-44 | 40S ribosomal S30 protein | EU040041
Cluster 11 | A07PPMGBSP28 | ENSANGP00000028746 [A. gambiae] | 1.0E-33 | Unknown | EU040042
Cluster 12 | PPMGM173 | similar to CG4778-PA [T. castaneum] | 8.0E-11 | Peritrophin | EU047543
Cluster 13 | A02PPINFLP30 | similar to CG4778-PA [T. castaneum] | 7.0E-11 | Peritrophin | EU047543
Cluster 15 | H08PPMGBLP31 | chymotrypsin [P. papatasi] | 1.0E-154 | Chymotrypsin | AAM96939
Cluster 16 | E10_ppmgbm_p21 | carboxypeptidase B [A. aegypti] | 4.0E-67 | Carboxypeptidase | EU047544
Cluster 17 | C07PPMGSP24 | ribosomal protein S20 [B. mori] | 9.0E-57 | 40S ribosomal protein S20 | EU047545
Cluster 18 | H10_ppmgm_p22 | trypsin 1 [P. papatasi] | 1.0E-151 | Trypsin | AAM96940
Cluster 20 | H05_ppmgm_p22 | CG32276-PB, isoform B [D. melanogaster] | 2.0E-23 | Ribosome associated membrane protein | EU047546
Cluster 21 | PPMGBS47 | Ribosomal protein L19 [D. melanogaster] | 1.0E-100 | 60S ribosomal protein L19 | EU047547
Cluster 23 | D05PPMGLP28 | trypsin 2 [P. papatasi] | 1.0E-157 | Trypsin | AAM96941
Cluster 24 | G04PPMGMP25 | RE59709p [D. melanogaster] | 7.0E-68 | 60S ribosomal protein L32 | EU047548
Cluster 25 | PPMGSP31B06 | similar to D. melanogaster CG3203 [D. yakuba] | 1.0E-91 | 60S ribosomal protein L17 | EU045355
Cluster 26 | PPMGLP29D06 | peritrophin-like protein 1 [C. felis] | 2.0E-36 | Peritrophin | EU045354
Cluster 29 | PPINFM-P7-G10 | Ribosomal protein L29 [D. melanogaster] | 4.0E-24 | 60S ribosomal protein L29 | EU045353
Cluster 31 | B03PPMGMP25 | CG13551 [D. melanogaster] | 3.0E-37 | Unknown | EU049582
Cluster 32 | H04PPMGLP23 | LD17235p [D. melanogaster] | 1.0E-93 | 60S ribosomal protein L11 | EU045352
Cluster 34 | PPMGM152 | similar to D. melanogaster RpL14 [D. yakuba] | 5.0E-50 | 60S ribosomal protein L14 | EU045351
Cluster 35 | E07_ppmgs_p21 | 60S acidic ribosomal protein P1 [S. frugiperda] | 1.0E-45 | 60S acidic ribosomal protein P1 | EU045350
Cluster 37 | PPMGL197 | ENSANGP00000019623 [A. gambiae] | 4.0E-63 | Astacin | EU045349
Cluster 40 | F01PPINFLP30 | S7 ribosomal protein [C. pipiens quinquefasciatus] | 3.0E-89 | 40S ribosomal protein S7 | EU045348
Cluster 73 | A03PPPMGBMP25 | unknown [C. sonorensis] | 4.0E-25 | Unknown | EU045347
Cluster 75 | C04_ppmgs_p21 | similar to D. melanogaster qm [D. yakuba] | 1.0E-117 | 60S ribosomal protein L10 | EU045346
Cluster 89 | B11PPINFLP32 | trypsin 4 [P. papatasi] | 1.0E-129 | Trypsin | AAM96943
Cluster 94 | F08PPINFMP22 | microvilli membrane protein [A. aegypti] | 2.0E-35 | Microvilli protein | EU047549
Cluster 96 | C05_ppmgm_p22 | Cr-PII [P. americana] | 3.0E-20 | Microvilli protein | EU047550
Cluster 98 | B12_ppmgbl_p23 | ENSANGP00000017713 [A. gambiae] | 1.0E-17 | Microvilli protein | EU047551
Cluster 99 | A03_ppmgbl_p20 | hypothetical protein [T. castaneum] | 2.0E-23 | Unknown | EU045345
Cluster 103 | H03PPMGLP23 | GA13179-PA [D. pseudoobscura] | 2.0E-43 | Ferritin | EU045344
Cluster 106 | H07_ppmgm_p22 | TPA inf: HDC07203 [D. melanogaster] | 8.0E-14 | Unknown | EU045343
Cluster 111 | PPMGLP34H09 | GA16408-PA [D. pseudoobscura] | 2.0E-10 | Kazal type serine protease inhibitor | EU045342
Cluster 113 | B10PPINFLP21 | carboxypeptidase A [A. aegypti] | 2.0E-87 | Carboxypeptidase | EU045341
Cluster 119 | PPMGLP29B04 | midgut specific galectin [P. papatasi] | 1.0E-145 | Galectin | AAT11557
Cluster 122 | C08PPINFLP31 | GA15307-PA [D. pseudoobscura] | 3.0E-62 | Ferritin | EU045340
Cluster 125 | D06PPMGLP23 | Glutathione S-transferase [M. domestica] | 3.0E-86 | Glutathione S-transferase | EU045339
Cluster 126 | PPMGBL175 | 10 kDa salivary protein [P. ariasi] | 6.0E-07 | Unknown | EU045338
Cluster 127 | H07_ppmgs_p21 | ribosomal protein S8 [A. albopictus] | 3.0E-94 | 40S ribosomal protein S8 | EU045337
Cluster 128 | C02_ppmgs_p21 | 60S acidic ribosomal protein P2 [A. aegypti] | 8.0E-39 | 60S acidic ribosomal protein P2 | EU045336
Cluster 129 | PPMGLP29C01 | ENSANGP00000016569 [A. gambiae] | 5.0E-40 | membrane LPS inducible TNF protein | EU035828
Cluster 134 | C01PPINFMP22 | similar to D. melanogaster RpS18 [D. yakuba] | 9.0E-70 | 40S ribosomal protein S18 | EU035827
Cluster 135 | D04PPINFLP31 | trypsin 3 [P. papatasi] | 1.0E-144 | Trypsin | AAM96942
Cluster 139 | PPMGSP31A11 | CG30415-PB, isoform B [D. melanogaster] | 1.0E-27 | Unknown | EU035823
Cluster 146 | PPMGM132 | similar to CG2998-PA [T. castaneum] | 2.0E-27 | 40S ribosomal protein S28 | EU035822
Cluster 147 | A06PPMGMP25 | Ribosomal protein L23 [D. melanogaster] | 5.0E-73 | 60S ribosomal protein L23 | EU035821
Cluster 149 | B06PPMGMP25 | similar to D. melanogaster RpS12 [D. yakuba] | 1.0E-64 | 40S ribosomal protein S12 | EU035820
Cluster 150 | PPMGLP29D08 | ENSANGP00000021011 [A. gambiae] | 1.0E-111 | Unknown | EU035819
Cluster 153 | B07PPMGSP24 | similar to D. melanogaster CG2033 [D. yakuba] | 3.0E-67 | 40S ribosomal protein S15 | EU035818
Cluster 158 | D06PPINFLP21 | ENSANGP00000013724 [A. gambiae] | 5.0E-70 | Ryanodine receptor | EU035817
Cluster 163 | PPMGM133 | cyclophylin isoform [A. aegypti] | 8.0E-82 | Cyclophilin | EU032351
Cluster 165 | A11PPMGMP25 | 60S ribosomal protein L40 [A. albopictus] | 4.0E-68 | Ubiquitin / ribosomal L40 fusion | EU032350
Cluster 167 | PPMGL276 | similar to ENSANGP00000002356 [A. mellifera] | 2.0E-66 | Na+/K+ ATPase | EU032348
Cluster 168 | F05_ppmgbm_p21 | chymotrypsin [P. papatasi] | 1.0E-147 | Chymotrypsin | AAM96938
Cluster 171 | G12_ppmgs_p21 | GA16582-PA [D. pseudoobscura] | 8.0E-78 | 60S ribosomal protein L12 | EU032349
Cluster 174 | A06_ppmgm_p22 | similar to D. melanogaster CG2099 [D. yakuba] | 1.0E-54 | 60S ribosomal protein L35A | EU032347
Cluster 176 | A03_pppmgbl_p24 | translation factor SUI1-like protein [A. aegypti] | 1.0E-54 | Translation initiation factor 1 | EU032346
Cluster 177 | A08PPMGBSP28 | GA10714-PA [D. pseudoobscura] | 5.0E-92 | ADP ribosylation factor | EU032345
Cluster 182 | PPMGSP31E04 | similar to D. melanogaster CG10423 [D. yakuba] | 2.0E-38 | 40S ribosomal protein S27 | EU032344
Cluster 183 | D12_ppmgm_p22 | ENSANGP00000026718 [A. gambiae] | 1.0E-61 | Cytochrome C oxidase subunit IV | EU032343
Cluster 184 | C02PPINFLP30 | unknown [C. sonorensis] | 4.0E-21 | Unknown | EU032342
Cluster 185 | D04PPINFMP27 | cytochrome b [P. papatasi] | 1.0E-110 | Cytochrome B | AF161214
Cluster 186 | MGL69 | similar to CG9916-PA isoform 1 [T. castaneum] | 4.0E-81 | Cyclophilin | EU032341
Cluster 187 | PPINFM-P7-F11 | ribosomal protein S17e [Eucinetus sp.] | 9.0E-31 | 40S ribosomal protein S17 | EU032340
Cluster 188 | H01PPMGLP23 | 10 kDa salivary protein [P. ariasi] | 2.0E-06 | Unknown | EU032339
Cluster 201 | G11PPPMBGMP26 | larval chymotrypsin-like protein [A. aegypti] | 5.0E-82 | Chymotrypsin | EU035826
Cluster 228 | PPMGM508 | peroxiredoxin-like protein [A. aegypti] | 1.0E-69 | Peroxiredoxin | EU035825
Cluster 232 | PPMGBL92 | glutathione S-transferase [A. aegypti] | 4.0E-59 | Glutathione S-transferase | EU035824
Cluster 243 | D11_pppmgbl_p24 | midgut chitinase [P. papatasi] | 1.0E-140 | chitinase | AAV49322

sequence, the third column shows the best match in the non-redundant protein database
(GenBank, NCBI), the fourth column shows the E-value for the best matching BLAST
result in column 3, the fifth column shows the assigned putative function of that cluster,
and the sixth column shows the accession number of the transcript submitted to
GenBank. The four most abundant transcripts were a microvilli-associated like protein,
followed by a peritrophin-like protein, 40S ribosomal protein S30 and a transcript coding
for a protein of unknown function. Other abundant transcripts include those coding
for various ribosomal proteins, chymotrypsins, carboxypeptidases, trypsins, a zinc
metalloprotease astacin, a Kazal-type serine protease inhibitor, glutathione S-transferase
(GST) and various proteins of unknown function (Table 4). All the sequences generated
from these three cDNA libraries have been deposited as an EST database at the National
Center for Biotechnology Information (NCBI; accession numbers ES346912 - ES351350 and
ES351429). The following is a more detailed description of relevant transcripts
identified in the cDNA libraries:

Microvilli-associated  like  proteins 

Among the most abundant transcripts found in the combined analysis of all three
libraries were transcripts coding for proteins with similarities to microvilli membrane
proteins from A. aegypti and A. gambiae. These transcripts are also homologous to major
allergens identified in the cockroaches Blattella germanica and Periplaneta americana
[18] and to a nitrile-specifier protein (PrNSP) from the midgut of Pieris rapae. PrNSP
converts toxic compounds, such as isothiocyanate, into less toxic
compounds, such as nitriles, that are excreted in the feces of larval stages of this
Lepidopteran [19].

Four different putative microvilli-associated proteins were identified in the three
P. papatasi midgut cDNA libraries (Figure 13). Clusters 1, 2, and 3 represent likely
polymorphisms of the same transcript, named here “microvilli protein 1” (PpMVP1),
which has a predicted molecular weight of 23.7 kDa. Another three transcripts coding for
microvilli proteins and derived from clusters 94, 96 and 98 were named PpMVP2,
PpMVP3, and PpMVP4, respectively. The predicted molecular weights of these
microvilli-associated like proteins are 24.0, 25.6, and 25.6 kDa, respectively.
Additionally, each of these microvilli proteins has a potential signal peptide as predicted
by SignalP 3.0 and no evidence of transmembrane helices as predicted using the
TMHMM 2.0 server. Identity between the amino acid sequences of these microvilli
proteins ranges from 21 to 36 percent (Figure 13, black-shaded amino acids) and
similarity from 45 to 57 percent (Figure 13, grey-shaded amino acids). This modest degree
of conservation may indicate that these proteins are biochemically distinct from one
another and share a name only because of the previous annotation of similar
sequences in other organisms. Searching the translated assembled sequences from an EST
database of Lu. longipalpis identified NSFM-139c08, NSFM-18h11, NSFM-68e08, and
NSFM-47h07 as having high sequence homology to the microvilli-associated like proteins
PpMVP1, PpMVP2, PpMVP3, and PpMVP4, respectively [17].
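
Percent identity and percent similarity of the kind quoted above can be computed from a pairwise alignment. The sketch below is a minimal illustration under stated assumptions: the residue groups that define "similar", the use of all aligned columns as the denominator, and the function name identity_similarity are choices made here and may differ from the conventions of the alignment software used for Figure 13. The two short aligned strings are invented.

```python
# Conservative substitution groups (an assumption; shading programs differ).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; all other columns count in the denominator. Similarity here
    includes identical residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    group_of = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}
    columns = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        columns += 1
        if x == "-" or y == "-":
            continue
        if x == y:
            identical += 1
            similar += 1
        elif x in group_of and group_of[x] == group_of.get(y):
            similar += 1
    if columns == 0:
        return 0.0, 0.0
    return 100.0 * identical / columns, 100.0 * similar / columns


# Invented six-column alignment: 3 identical columns, 2 similar, 1 gap.
print(identity_similarity("MKLV-A", "MRLIGA"))
```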
Figure 13. Multiple sequence alignment of the four putative microvilli associated-like
proteins found in the midgut of Phlebotomus papatasi.

Predicted signal peptide sequence is underlined and the accession numbers given
in parentheses.

Peritrophin-like proteins

Transcripts coding for three different putative peritrophin-like molecules were
identified in the midgut of P. papatasi. PpPer1 (cluster 9) and PpPer2 (clusters 12 and
13) transcripts code for secreted proteins with predicted molecular masses of 29.8 and 9.6
kDa, respectively. PpPer1 comprises four potential chitin-binding peritrophin-A
domains (Figure 14A). PpPer2 is a much smaller predicted protein and has only one
potential chitin-binding domain (Figure 14A). A third putative peritrophin, PpPer3, was
identified from cluster 26, with an apparent molecular mass of approximately 32 kDa
(Figure 14A), and contains two distant putative chitin-binding domains. Phylogenetic
analysis using the chitin-binding domains of PpPer1, PpPer2, PpPer3 and those of
peritrophins from several insects (Figure 14B) suggests a low level of conservation
between the domains. Insect peritrophins have been reported to bind to chitin fibers via
multiple chitin-binding domains, forming the scaffold that maintains the molecular
structure of the peritrophic matrix (PM) in the mosquito gut [20]. In addition to their role
in the formation of the PM, peritrophins may also play a role in preventing the toxic
effects of heme, a by-product of blood meal digestion. In Ae. aegypti, AeIMUC1, a
mucin with putative chitin-binding domains, was recently shown to bind heme
[21]. Although peritrophins have been characterized from several insects, including Ae.
aegypti and An. gambiae [20, 22], no information exists related to sand fly
midgut-specific peritrophins. PpPer1 and PpPer3 have high sequence similarity, at the protein
level, to the translated sequences SFM-03c06 and SFM-02h07 of the Lu. longipalpis EST
database. However, PpPer2 has lower sequence similarity to any assembled and
[Figure 14 image not reproduced. Tree labels recovered from panel B: PpPer2D1 (EU047543), AeIMUC1D3 (AF308863), PpPer3D2 (EU045354).]

Figure 14. Characterization of peritrophin sequences.

(A) Diagrammatic representation of Phlebotomus papatasi peritrophin-like
molecules showing the predicted signal peptide and chitin-binding domains. (B)
Phylogenetic analysis of chitin-binding domains of peritrophin molecules from Aedes
aegypti (Ae), Anopheles gambiae (Ag), Ctenocephalides felis (Cf), Lucilia cuprina (Luc),
Phlebotomus papatasi (Pp), Lutzomyia longipalpis (Ll). Accession numbers are
indicated in parentheses and bootstrap values at the nodes.

translated sequences of the Lu. longipalpis database, suggesting a more divergent or
novel molecule.

Trypsin 

Among the most abundant transcripts in the cDNA libraries were the previously
characterized P. papatasi trypsin-like transcripts PpTryp1 (cluster 18, with 158 sequences) and
PpTryp4 (cluster 89, with 114 sequences) [13]. PpTryp2 (cluster 23) and PpTryp3
(cluster 135) were less abundant, with 12 and 8 sequences, respectively. Phylogenetic
analysis of trypsins from P. papatasi and from other organisms resulted in the formation
of two major clades, each supported by maximum likelihood analysis (Figure 15). P.
papatasi trypsins grouped in clade I, which contains other insect trypsins, while their
mammalian counterparts were found in clade II (Figure 15). As detected previously [13],
PpTryp1 and PpTryp2 form a clade separate from the clade formed by PpTryp3 and
PpTryp4 (Figure 15). The P. papatasi trypsins PpTryp1, PpTryp2, PpTryp3 and PpTryp4
show high protein sequence similarity to Lu. longipalpis ESTs NSFM-02a01,
NSFM-113h06, NSFM-94b08, and NSFM-165c07, respectively.

Chymotrypsin 

Two previously characterized P. papatasi chymotrypsin-like cDNAs, PpChym1
and PpChym2 [13], as well as a novel chymotrypsin-like cDNA, PpChym3 (cluster 113), were
also found in the transcriptome database. This newly identified chymotrypsin-like
molecule was found in low abundance in the blood-fed midgut library. The predicted
PpChym3 has 36% amino acid identity to PpChym1 and 30% amino acid identity to
Figure 15. Phylogenetic analysis of trypsins.

Caenorhabditis elegans (Ce), Rattus norvegicus (Rn), Mus musculus (Mm), Homo
sapiens (Hs), Blattella germanica (Bg), Anopheles gambiae (Ag), Anopheles stephensi
(As), Aedes aegypti (Aa), Drosophila melanogaster (Dm), Culicoides sonorensis (Cs),
and Phlebotomus papatasi (Pp). The accession number of the sequence used is in
parentheses and node support is indicated by the bootstrap values.

PpChym2. Furthermore, PpChym3 has a signal secretory peptide (Figure 16A) and has
the required His/Asp/Ser amino acid triad necessary for catalytic activity (Figure 16B).
PpChym1 and PpChym2 both share sequence homology with the assembled sequence
NSFM-01d03 from the Lu. longipalpis EST database, while PpChym3 is most similar to
sequence SFM-01b03.

Carboxypeptidase

A number of sequences were identified with homology to carboxypeptidases. The
full-length transcript of a putative carboxypeptidase B, PpCpepB, was assembled from 37
sequences in cluster 16 and has high homology to a carboxypeptidase B identified in
Ae. aegypti (GenBank accession# AAT36733). The predicted amino acid sequence of
PpCpepB contains a signal peptide, a propeptide domain, and a carboxypeptidase
domain. A putative carboxypeptidase A, PpCpepA, was also identified from cluster 113
based on amino acid sequence homology. Phylogenetic analysis shows that the identified
P. papatasi putative carboxypeptidases are separated into distinct clades (Figure 17A).
Comparison of sequence homology indicates the potential for these molecules to have
substrate specificities of either carboxypeptidases A or B (Figure 17B). Sequence
alignment of the two carboxypeptidases depicts the difference in amino acid composition;
however, both sequences contain the zinc ion binding motifs of metallocarboxypeptidases
(Figure 17B). Additionally, the presence of a putative signal peptide suggests that these
molecules are midgut digestive enzymes. Similarity between these carboxypeptidases
and those present in the Lu. longipalpis EST database is evident from the high homology
between PpCpepA and SFM-05c11 and between PpCpepB and NSFM-32d09.

Figure 16. Chymotrypsin sequence analysis.

(A)  Diagrammatic  representation  of  PpChym3  sequence  showing  the  predicted 
signal  peptide  (underlined)  and  the  residues  of  the  catalytic  triad  (H/D/S)  marked  with  a 
triangle.  (B)  Sequence  alignment  of  the  three  Phlebotomus  papatasi  chymotrypsin-like 
sequences.  Identical  residues  are  highlighted  in  black  and  similar  residues  highlighted  in 
grey.  The  predicted  signal  peptides  are  underlined,  the  catalytic  residues  marked  with  (*) 
and  the  accession  numbers  are  in  parentheses. 

Figure 17. Phlebotomus papatasi midgut carboxypeptidase-like proteins.

(A)  Phylogenetic  analysis  of  carboxypeptidases  from  Caenorhabditis  elegans 
(Ce),  Aedes  aegypti  (Ae),  Anopheles  gambiae  (Ag),  Drosophila  melanogaster  (Dm), 
Ochlerotatus  triseriatus  (Ot),  Tribolium  castaneum  (Tc),  and  Phlebotomus  papatasi  (Pp). 
Accession  numbers  are  indicated  in  parentheses  and  node  support  indicated  by  the 
bootstrap  values.  (B)  Sequence  comparison  of  midgut  Phlebotomus  papatasi 
carboxypeptidase  A  (PpCpepA)  and  carboxypeptidase  B  (PpCpepB).  The  predicted 
signal  peptide  is  underlined,  and  the  residues  necessary  for  zinc  binding  (H  and  E)  are 
indicated  by  (*). 

Astacin-like zinc metalloprotease

A putative astacin-like zinc metalloprotease (PpAstacin) was identified from
cluster 37, a product of five sequences. This putative astacin-like protein displays a
predicted signal peptide and a slightly modified form of the signature zinc-binding
catalytic domain for proteins in the astacin family (HEXXHXXGFXHEXXRXDR). In
PpAstacin, changes in two residues (E to M and R to A) resulted in the motif
HEFLHALGFFHMQSASDR (Figure 18). Although the altered residues may be
involved in target specificity, the zinc-binding catalytic domain remains conserved. The
likely role of this putative protein is blood meal digestion, as astacin molecules have not
been implicated in immune functions and a considerable number of transcripts
constituting this cluster were derived from the blood-fed midgut cDNA library. This is
the first report of this type of protease from the gut of a sand fly, though NSFM-127b08
of the Lu. longipalpis EST database was identified based on sequence homology.
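
The comparison between the astacin family signature and the PpAstacin motif can be checked mechanically. The sketch below uses only the two strings quoted in the text above; the function names signature_to_regex and deviations are hypothetical, and treating X as "any residue" is the usual reading of such signatures rather than something stated in the text.

```python
import re

ASTACIN_SIGNATURE = "HEXXHXXGFXHEXXRXDR"  # family signature, X = any residue
PP_ASTACIN_MOTIF = "HEFLHALGFFHMQSASDR"   # motif reported for PpAstacin


def signature_to_regex(signature):
    """Turn a signature with X wildcards into a compiled regular expression."""
    return re.compile("".join("." if ch == "X" else ch for ch in signature))


def deviations(signature, motif):
    """List (position, expected, observed) where a fixed residue differs.

    Positions are 1-based and counted within the motif, not the full protein.
    """
    if len(signature) != len(motif):
        raise ValueError("signature and motif must have the same length")
    return [
        (pos, expected, observed)
        for pos, (expected, observed) in enumerate(zip(signature, motif), start=1)
        if expected != "X" and expected != observed
    ]


# The strict signature does not match, and exactly two fixed positions differ.
print(signature_to_regex(ASTACIN_SIGNATURE).fullmatch(PP_ASTACIN_MOTIF))  # None
print(deviations(ASTACIN_SIGNATURE, PP_ASTACIN_MOTIF))
# [(12, 'E', 'M'), (15, 'R', 'A')]
```

The two reported positions correspond to the E to M and R to A changes described above, while the three histidines of the signature are unchanged.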

Kazal-type  serine  protease  inhibitor 

Two Kazal-type serine protease inhibitors were identified from clusters 111
(PpKZL1) and 859 (PpKZL2) in the cDNA midgut libraries. PpKZL1 codes for a small
peptide of 78 amino acids, while PpKZL2 codes for a peptide of 89 amino acids. Both
proteins are predicted to be secreted based on the presence of signal peptides (Figure 19).
PpKZL1 is similar to various small Kazal-type inhibitors found in Drosophila
pseudoobscura (gi: 125986397), C. sonorensis (gi: 56199538) and the mosquitoes Ae.
aegypti and An. gambiae, and to larger Kazal-type molecules such as infestin [23] from
Triatoma infestans (Figure 19A). There is only 28% identity and 42% similarity

Figure 18. Multiple sequence analysis of astacin-like proteins.

Sequence alignment of zinc protease astacin-like sequences from Phlebotomus
papatasi (Pp), Aedes aegypti (Ae), Anopheles gambiae (Ag), Culicoides sonorensis (Cs),
Drosophila melanogaster (Dm), Glossina morsitans morsitans (Gm), Astacus astacus
(As), Caenorhabditis elegans (Ce), Mus musculus (Mm), and Homo sapiens (Hs).
Arrows indicate the residues likely necessary for catalytic activity. Accession numbers
are shown.

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Figure  19.  Sequence  analysis  of  Kazal-type  proteins. 

(A)  Sequence  alignment  of  Kazal-type  proteins  from  Phlebotomus  papatasi  (Pp), 
Aedes  aegypti  (Ae),  Culicoides  sonorensis  (Cs),  Drosophila  melanogaster  (Dm)  and 
Triatoma  infestans  (Ti).  The  predicted  signal  peptide  sequences  are  underlined  and  the 
conserved  cysteine  residues  denoted  by  #.  Identical  residues  are  highlighted  in  black  and 
similar  residues  highlighted  in  grey.  PpKZLl  accession  number  is  EU045342  (B) 
Sequence  comparison  of  the  two  Kazal-type  proteins  (PpKZLl  and  PpKZL2)  from 
Phlebotomus  papatasi  found  in  the  midgut  cDNA  libraries.  Identical  residues  are 
highlighted  in  black  and  similar  residues  highlighted  in  grey. 
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between  PpKZLl  and  PpKZL2  (Figure  19B)  suggesting  these  may  have  different 
functions.  Additionally,  these  two  Kazal-type  cDNAs  are  similar  to  two  previously 
characterized  thrombin  inhibitors,  rhodniin  and  infestin,  from  the  triatomines  Rhodnius 
prolixus  [24]  and  T.  infestans  [23],  respectively.  Due  to  their  anti-hemostatic  effect, 
rhodniin  and  infestin  are  believed  to  play  a  role  in  the  fluidity  of  the  blood  within  the 
midgut  of  these  vectors.  It  is  conceivable  that  one  or  both  transcripts  coding  for  Kazal- 
type  thrombin  inhibitors  identified  in  P.  papatasi  may  play  a  role  in  blood  fluidity  within 
the  sand  fly  midgut,  allowing  it  to  be  fully  digested  by  the  various  proteases  secreted 
within  the  midgut  following  the  blood  meal.  These  represent  the  first  Kazal-type  serine 
protease  inhibitors  identified  from  sand  flies.  PpKZL2  shares  low  sequence  similarity 
with  SFM-0406  from  the  Lu.  longipalpis  EST  database  and  no  significant  similarities 
were  identified  for  PpKZLl. 
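Identity and similarity percentages of the kind quoted for the two Kazal-type proteins are
computed column by column from a pairwise alignment. The following is a minimal sketch
only: the residue groups, the choice to ignore gapped columns, and the toy sequences are
all assumptions made for illustration, not the settings or data behind Figure 19B.

```python
# Assumed grouping of chemically similar residues. Alignment viewers differ in the
# groups (or scoring matrix) they use, so similarity values depend on this choice.
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(res_a in group and res_b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not sand fly sequence data.
    print(identity_and_similarity("CKA-DE", "CRAGEE"))
```

Dividing by the full alignment length instead of the ungapped columns would give lower
percentages, so the denominator should be stated whenever such values are reported.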

Ferritin 

Two transcripts encoding putative ferritin light (PpFLC) and heavy (PpFHC)
chain subunits were identified in clusters 103 and 122, respectively (Figure 20). After the
ingestion of a blood meal, the fly encounters a tremendous dose of iron and heme, which
would be fatal to most organisms. Ferritin is one of the important factors in controlling
the high iron load in hematophagous insects. The midgut of blood-feeding insects
envelops the blood meal, which makes the midgut tissue the most likely site
of iron regulatory molecules. However, ferritin may also be important for oxidative
stress not related to the presence of iron or heme, as it is induced by the presence of H2O2
in Ae. aegypti [25]. PpFLC and PpFHC are similar to NSFM-144g07 and NSFM-146d09,
respectively, molecules identified by searching the Lu. longipalpis EST database.

Figure 20. Sequence analysis of ferritin heavy and light chain molecules.

Sequence alignment of sequences from Aedes aegypti (Ae), Anopheles gambiae
(Ag), Glossina morsitans morsitans (Gm), Drosophila melanogaster (Dm), and
Phlebotomus papatasi (Pp). (A) Light-chain ferritin subunits. (B) Heavy-chain ferritin
subunits. Arrows indicate residues associated with the ferroxidase center, the predicted
signal peptide sequence is underlined and the accession numbers are given.

Glutathione  S-transferase  (GST) 

From  clusters  125  and  232,  two  transcripts  were  identified  to  encode  putative 
GSTs  with  homology  to  other  Dipteran  GSTs  in  the  Sigma  and  Delta/Epsilon  classes, 
respectively.  The  predicted  molecular  weights  of  the  two  putative  proteins  are  similar  at 
23.2  kDa  for  cluster  125  and  24.5  kDa  for  cluster  232.  Within  the  midgut,  these  proteins 
may  play  an  important  role  in  the  regulation  of  reactive  oxygen  species  which  occur  as  a 
by-product of hemoglobin digestion. Clusters 125 and 232 share high protein sequence
similarity with Lu. longipalpis ESTs NSFM-105e10 and NSFM-74c11, respectively.

Unknown  proteins 

A  large  number  of  clusters  produced  by  the  three  cDNA  libraries  have  no 
sequence  similarity  to  other  known  proteins.  This  has  also  been  observed  in  the  analysis 
of  the  Chironomus  tentans  midgut,  with  good  evidence  that  the  unknown  transcripts 
contained  coding  sequences  [26].  It  is  also  possible  that  the  abundance  of  unidentifiable 
sequences  may  be  caused  by  the  sequence  quality  of  the  transcripts  or  that  the  captured 
sequences  are  3'  untranslated  regions,  non-coding  small  nuclear  RNA,  or  sequences  of 
uncharacterized  organisms  such  as  bacteria  and  yeast  present  in  the  sand  fly  midgut.  A 
number  of  clusters  with  unknown  functions  were  identified  as  coding  sequences  that 
exhibited signal peptides, such as clusters 11 and 126.

Functionally characterized proteins

From  the  three  cDNA  libraries,  we  identified  chitinase  transcripts  that  were  then 
expressed  as  recombinant  proteins  for  the  demonstration  of  activity  in  the  midgut  of 
P.  papatasi  sand  flies  [15].  Another  product  of  the  cDNA  libraries  was  the  identification 
and  characterization  of  a  galectin  protein  as  the  first  arthropod  receptor  for  a  parasite; 
specifically,  L.  major  within  the  P.  papatasi  sand  fly  midgut  [1]. 

Comparative analysis of transcripts that differ significantly between the sugar-fed and
blood-fed midgut cDNA libraries

To investigate the effects of blood-feeding on the midgut expression profile in P.
papatasi, we compared the abundance of transcripts in the sugar-fed and blood-fed cDNA
libraries. We hypothesized that a blood meal affects the expression of sand fly midgut
transcripts and that this effect is reflected in the relative number of sequences each
library contributes to a cluster. Chi-square statistical analysis was used to evaluate
the significance of the differences in the abundance of midgut transcripts between the
sugar-fed and blood-fed cDNA libraries, thereby identifying midgut transcripts with
different expression profiles in the two libraries.
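The comparison described above can be illustrated with a 2 x 2 chi-square test on the
number of sequences a cluster receives from each library. This is a minimal sketch under
stated assumptions, not the procedure of reference [35]: it uses the Pearson statistic
with one degree of freedom and no continuity correction, and the library totals in the
example are hypothetical because they are not given in this section.

```python
import math


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square test (1 d.f.) for a cluster's share of two libraries.

    count_x is the number of sequences the cluster received from library x;
    total_x is the total number of sequences obtained from that library.
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand_total = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [count_a + count_b, grand_total - count_a - count_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / grand_total
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected

    # For one degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical library sizes of 1000 sequences each (assumption).
    chi2, p_value = chi_square_2x2(0, 1000, 54, 1000)
    print(f"chi2 = {chi2:.2f}, P = {p_value:.2e}")
```

Because the test compares proportions, the same pair of counts yields a different P
value when the two libraries differ in size, so counts alone do not determine significance.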

We observed a significant difference (P value < 0.05) in the abundance of a
number of midgut transcripts when we compared the sugar-fed and blood-fed sand fly
midgut cDNA libraries. Table 5 lists selected transcripts that were either more
abundantly or less abundantly represented in these two cDNA libraries.

Table 5: Clusters overrepresented in the sugar-fed and blood-fed midgut cDNA libraries
as determined by χ2 statistical analysis

Putative function              Sugar fed   Blood fed   P value    GenBank
Microvilli protein (PpMVP1)    0           195         4.3E-58    EU031911
Microvilli protein (PpMVP2)    1           60          1.9E-17    EU047549
Microvilli protein (PpMVP3)    39          8           1.1E-04    EU047550
Microvilli protein (PpMVP4)    0           18          2.4E-06    EU047551
Peritrophin (PpPer1)           0           54          1.9E-16    EU031912
Peritrophin (PpPer2)           152         45          1.1E-10    EU047543
Ferritin light chain (PpFLC)   6           18          2.9E-03    EU045344
Chymotrypsin (PpChym2)         0           36          2.1E-11    AY128107
Trypsin (PpTryp1)              86          10          4.4E-14    AY128108
Trypsin (PpTryp4)              0           52          6.9E-16    AY128111
Unknown (Cluster 73)           13          21          4.6E-02    EU045347
Unknown (Cluster 99)           0           29          1.9E-09    EU045345

As expected, transcripts coding for proteolytic enzymes such as trypsin
(PpTryp4) and chymotrypsin (PpChym2) were more abundantly represented in the
blood-fed cDNA library than in the sugar-fed cDNA library (Table 5). Other transcripts
coding for peritrophin, microvilli-like proteins and ferritin were also more abundantly
represented in the blood-fed cDNA library. We also observed a number of transcripts
that were less abundantly represented in the blood-fed cDNA library, such as trypsin 1
(PpTryp1) and peritrophin (PpPer2).

Validation  of  transcript  abundance  of  selected  sequences  by  real-time  PCR 

In order to validate the results of the chi-square analysis, we further
characterized several transcripts by semi-quantitative end-point reverse-transcriptase
PCR as well as by real-time PCR. Both methods were used to assess the relative abundance
of transcripts in the midgut tissue under sugar-fed and blood-fed conditions. The
investigated transcripts included the peritrophins PpPer1 and PpPer2, as well as the
microvilli proteins PpMVP1, PpMVP2, and PpMVP4.

The results of semi-quantitative PCR are shown in Figures 21B and 21D. The
induction of PpPer1 is clearly seen, whereas the difference in PpPer2 abundance between
the two midgut conditions is less clear. Figure 21A shows the abundance of the PpPer1
transcript, as fold over the control gene, in non-blood-fed midguts and after blood meal
ingestion as measured by real-time PCR. Figure 21C shows the same real-time PCR analysis
of the PpPer2 transcript. The profile of the peritrophin transcripts by real-time PCR
strongly correlates with the profile found in the libraries based on the number of
sequences. Based on real-time PCR, PpPer1 expression is induced by blood digestion and is
not detected in sugar-fed midguts, corresponding with the absence of sequences from
the sugar-fed midgut cDNA library, compared to 54 sequences found in the blood-fed
library. As predicted by the high sequence abundance of PpPer2 in the sugar-fed cDNA
library, the expression of this transcript is highest in unfed sand flies and seems to be
downregulated by the ingestion of a blood meal.

Figure 21. Comparative abundance of peritrophin transcripts in sugar-fed or blood-fed
sand flies.

(A, C) PpPer1 and PpPer2 transcripts fold over control (reference transcript =
alpha tubulin) in unfed and blood-fed P. papatasi midgut. (B, D) Semi-quantitative PCR
amplified PpPer1 and PpPer2 transcripts separated by agarose electrophoresis.

Transcription levels of mRNAs coding for microvilli-like proteins (PpMVP1,
PpMVP2, and PpMVP4) tested by semi-quantitative PCR and real-time PCR are shown
in Figure 22 and illustrate the induction of transcription by the ingestion of a blood meal.
This mirrors the sequence abundance in the cDNA libraries, in which only
one sequence of PpMVP2 was observed in the sugar-fed cDNA library. The remaining
sequences were contributed by the cDNA library produced from blood-fed sand flies.

The low and high transcript abundance of PpTryp1 and PpTryp4, respectively, was in
accordance with the results of previously published end-point reverse-transcriptase PCR
[13]. Additionally, the previously characterized chitinase molecule, PpChit1, was
identified in cluster 243, which was formed by three sequences contributed by the blood-fed
cDNA library with none present in the sugar-fed cDNA library. The mRNA expression levels
of PpChit1 peak at 72 hours post-blood meal ingestion [15].

Comparative analysis of transcripts that differ significantly between the blood-fed and
L. major-infected midgut cDNA libraries

During its development within the midgut of the sand fly, Leishmania is faced
with various potential barriers that may prevent the establishment of the infection.
Among such potential barriers are digestive proteases (trypsins and chymotrypsins), the
peritrophic matrix and the requirement for parasite attachment to the midgut epithelia to
prevent excretion of parasites with remnants of the digested blood. Previous data
suggested that Leishmania is able to downregulate proteolytic activity in the sand fly
midgut [4]. Also, chitinases produced either by the sand fly [15] or by Leishmania
[27] facilitate the escape of parasites from the peritrophic matrix. Attachment to the
midgut epithelia occurs via sand fly receptors for L. major lipophosphoglycan, such
as PpGalec [1] or, in the case of permissive sand flies, via midgut
glycoproteins bearing terminal N-acetyl-galactosamine [28].

Figure 22. Transcript abundance of microvilli associated-like proteins compared
between unfed and blood-fed sand flies.

A, C, E: PpMVP1, PpMVP2, and PpMVP4 transcript fold over control (reference
transcript = alpha tubulin) in unfed and blood-fed P. papatasi midgut. B, D, F: PpMVP1,
PpMVP2, and PpMVP4 semi-quantitative PCR amplified transcripts separated by agarose
electrophoresis.

Table 6: Clusters overrepresented in the blood-fed and Leishmania major-infected sand
fly midgut cDNA libraries as determined by χ2 statistical analysis

Putative function              Blood fed   L. major    P value    GenBank
Microvilli protein (PpMVP1)    134         70          5.8E-07    EU031911
Microvilli protein (PpMVP2)    60          42          4.1E-02    EU047549
Peritrophin (PpPer1)           54          16          1.7E-06    EU031912
Peritrophin (PpPer2)           45          35          1.8E-02    EU047543
Ferritin light chain (PpFLC)   18          3           7.1E-04    EU045344
Chymotrypsin (PpChym2)         36          8           1.1E-05    AY128107
Trypsin (PpTryp1)              10          82          1.0E-13    AY128108
Unknown (Cluster 73)           21          6           2.6E-03    EU045347
Unknown (Cluster 99)           29          5           1.9E-05    EU045345

In sand flies, only a handful of midgut proteins have been clearly implicated in
Leishmania development. Previous data indicated that Leishmania is able to manipulate
the activity of certain digestive proteases, inhibiting or delaying their peak activity,
possibly in order to survive the proteolytic attack it faces in the midgut of the vector [3,
27]. We hypothesized that a blood meal containing L. major will affect the expression
profile of midgut transcripts, altering the abundance of the different transcripts in each of
these cDNA libraries. Table 6 shows the results of the chi-square analysis when
transcripts from the blood-fed and L. major-infected blood-fed cDNA libraries were
compared. Of interest, the abundance of transcripts coding for proteolytic enzymes was
dramatically decreased in the midgut cDNA library of sand flies fed on L. major-infected
blood. Other transcripts that appeared in reduced numbers included those coding for
microvilli-associated like proteins and peritrophins. Transcripts such as the one
corresponding to PpTryp1 (trypsin 1) and the one corresponding to PpPer2 (peritrophin 2)
were more abundant. Other transcripts coding for unknown proteins were also less
abundant in the L. major-infected cDNA library than in the blood-fed cDNA library.
These data suggest that the parasite may be affecting the expression
profile of these transcripts, and this inhibition, particularly of proteolytic enzymes, may
be advantageous for the survival and establishment of the parasite in the midgut of the
sand fly.


Conclusion 

Development of Leishmania within its sand fly host is largely restricted to the
vector midgut. Within the midgut, Leishmania begins its development confined within a
peritrophic matrix and subjected to the onslaught of digestive enzymes. Later, the
parasites attach to the epithelia to prevent excretion with remnants of the blood meal and
detach as they develop into the infective metacyclic form before being transmitted to a
suitable host during a subsequent blood meal. The sand fly midgut presents a number of
biological barriers that the Leishmania parasite must circumvent or
defeat to proliferate and develop inside the insect vector. Acquiring a better
understanding of the molecules present in this organ will illuminate the potential
molecular interactions occurring between the Leishmania parasite and the sand fly vector.
Comparative transcriptome analysis provides a powerful global approach, as
demonstrated by the repertoire of molecules identified from a whole organism or from a
specific tissue and by the generation of new hypotheses from these data. Large-scale
genome analyses benefit from data generated from transcriptome analyses, for example,
by aiding in the annotation of exons and introns.

The results of the present work provide insights into the repertoire of
molecules present in the midgut of the sand fly P. papatasi, the natural vector of L.
major. We identified a variety of molecules and obtained high-quality, full-length
sequences for many of them. The high-quality sequences were deposited at NCBI,
significantly augmenting the available midgut-specific coding sequences. A large
number of non-annotated sequences were deposited in the EST database to make these
transcripts accessible to the scientific community.

The global changes in the sand fly midgut expression profile were assessed by
comparing data generated from randomly sequenced midgut cDNA clones obtained from
cDNA libraries of adult females fed on sugar only, blood, or blood with the addition of L.
major. Our approach allowed for the identification of transcripts that are induced by
blood-feeding and likely participate in the digestion of the blood meal and events leading
to egg production. Digestion of blood as a nutritional source is complicated by the
cellular and molecular responses and components of the blood itself, once ingested by the
insect vector. Transcripts identified in the P. papatasi midgut, such as ferritin,
Kazal-type serine protease inhibitors, and GST, are examples of the molecules identified
in the gut of this insect. Additionally, the inclusion of an L. major-infected midgut cDNA
library provides insight into genes potentially regulated by this parasite during its
development within the sand fly midgut. The random sequencing approach followed by the
in silico analysis of transcript abundance was supported by experimental analyses
obtained via real-time PCR.

Overall, this analysis will contribute to the understanding of the molecular
interactions between Leishmania and the sand fly vector and may open new avenues for
basic research towards the control of this neglected vector-borne disease.

Methods 

Sand  flies 

Phlebotomus papatasi sand flies (Saudi Arabia strain) were obtained from
colonies maintained at Walter Reed Army Institute for Research (WRAIR) and at
NIAID-NIH. Three- to 5-day-old female sand flies were fed either on 20% sucrose
solution (sugar-fed) or on BALB/c mouse whole blood, via artificial meals [1], with or
without the addition of 2 x 10^6 L. major (V1 strain) amastigotes per ml.

Messenger  RNA  extraction  and  cDNA  library  construction 

Midguts were dissected from Phlebotomus papatasi females that were sugar-fed only
(10 midguts), blood-fed at 6 h (6 midguts), 24 h, 48 h and 72 h post-blood meal (PBM; 5
midguts each), or L. major-infected at 16 h (3 midguts), 22 h and 96 h (5 midguts
each) post infection (p.i.). Within each group (sugar-fed, blood-fed and L.
major-infected), the midguts were pooled for RNA extraction.
Messenger RNA was purified with the Micro-FastTrack mRNA isolation kit
(Invitrogen-Life Technologies, Carlsbad, CA), and 100 ng of mRNA was used to produce
first-strand cDNA. A cDNA library, enriched for full-length cDNA, was synthesized using
the SMART cDNA library construction kit (Clontech Laboratories, Mountain View, CA).
One microgram of double-stranded DNA for each original library (sugar-fed, blood-fed,
L. major-infected) was fractionated using a Chromaspin 1000 column (Clontech
Laboratories, Mountain View, CA) into small (S), medium (M) and large (L) transcripts
based upon their electrophoresis profile on a 1.1% agarose gel. Pooled fractions were
ligated into Lambda TriplEx2 vector (Clontech, Mountain View, CA) and packaged into
lambda phage (Stratagene, La Jolla, CA). Individual libraries were plated on LB agar
plates in order to achieve roughly 200-300 plaques per 182 mm plate.

Random  sequencing 

Unidirectional sequencing of randomly selected clones was completed as
previously described [10]. Single, isolated plaques were picked from the plate using
sterile wooden sticks and placed into 70 µl of water. Amplification of the cDNA was
performed using Platinum PCR SuperMix (Invitrogen), 4 µl template, and primers PT2F1
(AAG TAC TCT AGC AAT TGT GAG C) and PT2R1 (CTC TTC GCT ATT ACG CCA
GCT G). PCR amplification products were cleaned using either Multiscreen PCR
cleaning plates (Millipore) or Edge Biosystems PCR cleaning plates and three washes
with ultra pure water. The cleaned PCR product was resuspended in 25 µl of water, of
which 4 µl were used for cycle sequencing with PT2F3 primer (TCT CGG GAA GCG
CGC CAT TGT) and either DTCS reaction kit (Beckman) or Big Dye 3.1 (Applied
Biosystems). Sequencing reaction products were cleaned using Sephadex G-50 (GE
Healthcare) in a multiscreen cleaning plate (Millipore) and analyzed using either a
CEQ8000 (Beckman Coulter) or an ABI3700 (Applied Biosystems) DNA sequencing
instrument.

Bioinformatic analysis

A detailed description of the bioinformatic analysis of the data appears in [10, 29].
Briefly, prior to analysis the vector sequence was removed from the cDNA nucleotide
sequences. Sequence data from the three libraries were grouped together and aligned to
generate clusters of contiguous sequences or contigs based on 90% homology over 90
nucleotides, after sequences with more than 5% Ns were discarded. Three-frame
translations of the consensus sequence of each contig were compared, using
the appropriate BLAST algorithm, to the NCBI non-redundant protein database, the
conserved domain database [30] that contains the eukaryotic clusters of orthologous
groups (COG), Simple Modular Architecture Research Tool (SMART) and Protein Family
Database (Pfam), and the Gene Ontology database [31]. Nucleotide sequences were
directly compared with two customized databases, mitochondrial and ribosomal RNA
(rRNA) nucleotide databases, using BlastN. The presence of a signal
secretion peptide or transmembrane helices was determined by submitting the
peptide sequences to the SignalP server [32] or TMHMM server [33], respectively. The
Lu. longipalpis BLAST server was utilized to determine homology between the P.
papatasi clusters and Lu. longipalpis ESTs [34]. The number of transcripts each library
contributed to a particular contig was derived using a custom program, Count Libraries
(JMC Ribeiro, personal communication). Comparisons between the sugar-fed and
blood-fed midgut cDNA sequences and comparisons between blood-fed and L. major-infected
midgut cDNA sequences were based on separate chi-square analyses [35]. The grouped
and assembled sequences, BLAST results and signal peptide results were combined in an
Excel spreadsheet and the putative function, if any, was manually verified and annotated.
Sequences were aligned using Clustal X, version 1.83, and converted to graphical aligned
sequences using BioEdit, version 7.0.5.3 [36]. Phylogenetic analysis was conducted on
amino acid alignments using TREE-PUZZLE, version 5.2, generating trees by maximum
likelihood using quartet puzzling to calculate node support [37].
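The read-quality filter mentioned above (discarding sequences with more than 5% Ns
before clustering) can be sketched as follows. This is an illustrative reimplementation
only; the function names and the dictionary input are assumptions, and the actual
pipeline is the one described in [10, 29].

```python
def fraction_n(sequence):
    """Fraction of positions called as N; an empty read is treated as fully ambiguous."""
    sequence = sequence.upper()
    if not sequence:
        return 1.0
    return sequence.count("N") / len(sequence)


def filter_reads(reads, max_n_fraction=0.05):
    """Keep only reads whose fraction of N calls does not exceed max_n_fraction.

    reads maps a read name to its vector-trimmed nucleotide sequence.
    """
    return {
        name: sequence
        for name, sequence in reads.items()
        if fraction_n(sequence) <= max_n_fraction
    }


if __name__ == "__main__":
    # Hypothetical reads for illustration only.
    example = {
        "clean": "ACGT" * 10,           # no ambiguous calls
        "mostly_clean": "N" + "A" * 39, # 1 N in 40 nt (2.5%)
        "noisy": "ACGTN" * 8,           # 8 Ns in 40 nt (20%)
    }
    print(sorted(filter_reads(example)))
```

Filtering before assembly keeps low-quality reads from seeding or joining contigs,
which would otherwise distort the per-library sequence counts used for the chi-square
comparisons.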

Quantitative  PCR 

Quantitative PCR (qPCR) was performed on selected clones using first-strand
cDNA obtained from 100 ng of total RNA isolated from midguts dissected from P. papatasi
females fed on sugar (unfed) or dissected after a blood meal (24-72 h post-blood meal or
PBM). cDNAs were synthesized using the 1st Strand cDNA Synthesis kit (Invitrogen,
San Diego, CA). Transcript levels were measured with SYBR green dye using a
LightCycler 2.0 (Roche Diagnostics, Mannheim, Germany). For qPCR reactions, samples
were subjected to an initial holding step at 95°C for 15 minutes, followed by an
amplification step consisting of 35 cycles of 95°C for 10 seconds, 54°C for 20 seconds
and 72°C for 20 seconds with a single acquisition. The reaction continued with a
single-cycle melting step of 95°C for 10 seconds, 67°C for 30 seconds and 95°C for 10
seconds, prior to cooling for 1 minute. Equal amounts of cDNA were amplified using
gene-specific primer sets targeting individual transcripts as well as P. papatasi alpha
tubulin as the control or reference transcript. Reactions were routinely done in duplicate.
The relative expression ratio of the target transcript to the control or reference
transcript (fold over control) was calculated using the LightCycler relative
quantification software (Roche).
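The idea behind a "fold over control" value can be sketched from crossing-point (Cp)
values. The example assumes perfect doubling per cycle (amplification efficiency of 2)
for both primer sets and uses hypothetical Cp values; the LightCycler software applies
its own efficiency handling, which is not reproduced here.

```python
from statistics import mean


def fold_over_control(cp_target, cp_reference,
                      efficiency_target=2.0, efficiency_reference=2.0):
    """Ratio of target to reference transcript estimated from crossing points.

    A transcript that crosses the threshold n cycles later than the reference is
    taken to be efficiency**n times less abundant (assumed efficiencies of 2.0).
    """
    return (efficiency_reference ** cp_reference) / (efficiency_target ** cp_target)


if __name__ == "__main__":
    # Hypothetical duplicate Cp readings, not data from this study.
    target_cp = mean([24.1, 23.9])     # gene-specific primer set
    reference_cp = mean([20.0, 20.0])  # alpha tubulin reference
    print(f"fold over control: {fold_over_control(target_cp, reference_cp):.4f}")
```

Comparing this ratio between unfed and blood-fed samples gives the relative change in
transcript abundance, provided the reference transcript itself is unaffected by feeding.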
Semi-quantitative PCR

Semi-quantitative RT-PCR reactions were performed with selected transcripts to
further demonstrate the differential expression of these genes in P. papatasi midgut. In
this case, 100 ng of total RNA isolated from midguts dissected from P. papatasi females
fed on sugar (unfed) or dissected after a blood meal (48 h PBM) was used to synthesize
cDNA using the 1st Strand cDNA Synthesis kit (Invitrogen). PCR reactions were carried
out with an initial hot start at 95°C for 5 minutes followed by 25 cycles of 95°C for 30
seconds, 54°C for 1 minute and 72°C for 1.5 minutes and a final extension cycle of 72°C
for 5 minutes. PCR products were separated on 1.5% agarose.

Abstract

Background: In the life cycle of Leishmania within the alimentary canal of sand
flies, the parasites have to survive the hostile environment of blood meal digestion,
escape the blood bolus and attach to the midgut epithelium before differentiating into the
infective metacyclic stages. The molecular interactions between the Leishmania parasites
and the gut of the sand fly are poorly understood. In the present work, we sequenced five
cDNA libraries constructed from midgut tissue from the sand fly Lutzomyia longipalpis
and analyzed the transcripts present following sugar-feeding, blood-feeding and after the
blood meal had been processed and excreted, both in the presence and absence of
Leishmania infantum chagasi.

Results: Comparative analysis of the transcripts from sugar-fed and blood-fed
cDNA libraries resulted in the identification of transcripts differentially expressed during
blood-feeding. These included upregulated transcripts such as four distinct
microvillar-like proteins (LuloMVP1, 2, 4 and 5), two peritrophin-like proteins, a
trypsin-like protein (Lltryp1), two chymotrypsin-like proteins (LuloChym1A and 2) and an
unknown protein. Transcripts downregulated by blood-feeding were a microvillar-like
protein (LuloMVP3), a trypsin-like protein (Lltryp2) and an astacin-like metalloprotease
(LuloAstacin). Furthermore, a comparative analysis between blood-fed and
Leishmania-infected midgut cDNA libraries resulted in the identification of transcripts
that were differentially expressed due to the presence of Leishmania in the gut of the
sand fly. This included downregulated transcripts such as four microvillar-like proteins
(LuloMVP1, 2, 4 and 5), a chymotrypsin (LuloChym1A) and a carboxypeptidase
(LuloCpepA1), among others. Upregulated midgut transcripts in the presence of Leishmania
were a peritrophin-like protein (LuloPer1), a trypsin-like protein (Lltryp2) and an
unknown protein.

Conclusion: This transcriptome analysis represents the largest set of sequence
data reported from a specific sand fly tissue and provides further information on the
transcripts present in the sand fly Lutzomyia longipalpis. This analysis provides
detailed information on the molecules present in the midgut of this sand fly and the
transcripts potentially modulated by blood-feeding and by the presence of the Leishmania
parasite. More importantly, this analysis suggests that Leishmania infantum chagasi
alters the expression profile of certain midgut transcripts in the sand fly during blood
meal digestion and that this modulation may be relevant for the survival and
establishment of the parasite in the gut of the fly. Moreover, this analysis suggests that
these changes may be occurring during the digestion of the blood meal and not
afterwards.


Background 

Leishmaniasis  is  a  spectrum  of  diseases  caused  by  numerous  species  of  the 
kinetoplastid  parasite  Leishmania,  which  are  transmitted  by  Phlebotomine  sand  flies. 
Different  forms  of  disease  presentation  can  be  linked  with  the  various  species  of 
Leishmania  parasites,  with  the  visceral  form  of  the  disease  being  caused  mainly  by  the 
Old World Leishmania infantum or the New World variant Leishmania infantum
(chagasi). Visceral leishmaniasis is a disease that is commonly fatal if left untreated.
Currently, there is no licensed vaccine for the prevention of visceral disease in humans,
and current drug treatment with antimonials and other components is a lengthy and
arduous procedure with undesirable secondary effects [1].

The  sand  fly  Lutzomyia  longipalpis,  the  principal  vector  of  the  parasite 
Leishmania  infantum  chagasi,  is  the  most  significant  source  of  American  visceral 
leishmaniasis.  As  with  many  other  arthropod-borne  diseases,  transmission  of  the 
Leishmania  parasite  occurs  during  the  act  of  vector  blood-feeding  upon  a  vertebrate  host. 
Upon  blood  meal  ingestion,  a  large  number  of  events  are  induced,  including  digestion, 
metabolism,  diuresis,  and  ultimately  oogenesis.  Unlike  the  arboviruses,  Plasmodium  or 
Borrelia,  Leishmania  can  complete  the  necessary  developmental  changes  and  propagate 
to  numbers  sufficient  for  transmission  and  infection  solely  within  the  confines  of  the 
midgut  tissue  of  the  sand  fly  [2].  Several  sand  fly  proteases  involved  in  blood  meal 
digestion  and  implicated  in  the  species  specificity  between  Leishmania  and  the  respective 
vectors  have  been  characterized  and  include  trypsins,  chymotrypsins  and  chitinases  from 
both  Lu.  longipalpis  and  Phlebotomus  papatasi  [3,  4]. 

A  more  global  approach  to  identifying  and  characterizing  sand  fly  molecules  has 
been  accomplished  through  the  sequencing  of  whole  sand  fly-derived  expressed 
sequence  tags  [5].  While  that  study  contributes  to  the  knowledge  of  the  molecular 
components  of  the  sand  fly,  it  does  not  provide  the  specific  molecules  of  the  midgut 
tissue  that  would  interact  with  the  developing  parasites.  The  construction  and  sequencing 
of midgut tissue-specific cDNA libraries therefore aims to identify those molecules
involved in blood meal digestion and metabolism, peritrophic matrix formation and
possible parasite associations. Here, we have generated and sequenced five cDNA
libraries from the midgut tissue of Lu. longipalpis and investigated the molecules present
during sugar and blood-feeding as well as after the blood meal has been processed and
excreted, both in the presence and absence of L. infantum chagasi. In addition to the
identification of midgut-associated molecules, sequence analysis and phylogenetic
comparison of the sequences of Lu. longipalpis allow a better understanding of blood
meal processing in sand flies and the differences between visceral (Lutzomyia
longipalpis) and cutaneous leishmaniasis (Phlebotomus papatasi) sand fly vectors.

Results  and  discussion 

As  the  midgut  is  the  primary  organ  of  the  sand  fly  in  which  the  Leishmania 
parasite  develops,  cDNA  libraries  of  the  midgut  tissue  were  constructed,  sequenced  and 
analyzed  to  investigate  the  molecules  present  that  may  provide  for  important  interactions 
between  these  two  organisms.  In  total,  five  cDNA  libraries  were  constructed  from  the 
midgut  tissue  of  female  Lu.  longipalpis  during  different  conditions  of  feeding  and 
digestion.  These  conditions  included  one  library  combining  the  midguts  from  sand  flies 
allowed  to  feed  on  a  sucrose  solution  (SF),  a  pool  of  midgut  tissue  from  sand  flies  fully 
engorged  from  an  artificial  blood  meal  1,  2  and  3  days  post-blood  meal  ingestion  (BF), 
and  a  pool  of  midguts  from  gravid  sand  flies  5,  6  and  7  days  post-blood  meal  digestion 
(PBMD). The conditions chosen and the pooling of those times after blood meal
ingestion allow better coverage of the most abundant molecules transcribed in the
midgut as well as a comparison of the molecules present prior to blood-feeding, while the
blood bolus is present, during digestion of the blood meal and after the blood byproducts
have been excreted. Two cDNA libraries were constructed from the equivalent pools of
time points after blood-feeding in Lu. longipalpis midgut tissue from sand flies that had
ingested amastigote-infected macrophages in an artificial blood meal (BFi and PBMDi), a
more natural presentation of parasites to the blood-feeding sand fly.

Once constructed, approximately 2300 phage plaques were picked and ultimately
sequenced for each of the five cDNA libraries, generating a total of 9601 high quality
sequences from the midgut tissue of Lu. longipalpis. These sequences have been
submitted to the NCBI EST database under the accession numbers EW987149 -
EW996682. Table 7 summarizes the results of sequence quality and bioinformatics
analysis of each library and the combination of all libraries by the number of sequences
analyzed, the number of high quality sequences used in the bioinformatics analysis, the
number of contigs, the number of singletons and the average number of sequences per
contig. Each library generated a similar number of sequences, and sequence recovery
from the phage plaques ranged from 79-85%. After discarding low quality sequences,
each library retained 71-80% of sequences, with an average of 73% of the total 11,520 phage
producing high quality sequence data. Clustering similar sequences into contigs, based
on  sequence  homology,  produced  a  comparable  number  of  contigs  for  each  library  as 
well  as  a  similar  number  of  singletons.  The  comparable  number  of  high  quality 
sequences,  contigs  and  singletons  produced  from  each  library  allows  for  a  better 
comparison  between  the  sequence  abundance  of  specific  molecules  of  interest  and  the 
respective  biological  condition  of  the  midgut  under  which  they  were  recovered.  The 
average  number  of  sequences  in  the  clusters  of  contigs  varied  slightly  between  libraries. 
The  BF,  PBMD  and  PBMDi  cDNA  libraries  contained  an  average  sequence  per  cluster 
Table 7: Overall examination of the 5 individual cDNA libraries and the combined analysis

                          SF      BF      BFi     PBMD    PBMDi   Combined
Sequences analyzed        1822    1970    1928    1953    1928    9601
High quality sequences    1646    1845    1683    1650    1647    8471
Contigs                   148     137     156     125     117     655
Singletons                631     694     694     638     622     2279
Sequences/contig          6.86    8.40    6.34    8.10    8.76    9.45

Figure 23. Histograph of the number of sequences grouped into functional classes from
the sugar-fed, blood-fed and post-blood meal digestion cDNA libraries.

Sequences from clusters of those three cDNA libraries with an E-value less than
10E-5 as a result of the COG BLAST were grouped into the general functional class as
assigned by COG.

[Figure 23 axes: number of sequences, plotted for each functional class: Amino acid
transport and metabolism; Carbohydrate transport and metabolism; Cytoskeleton; Energy
production and conversion; Function unknown; Inorganic ion transport and metabolism;
Intracellular trafficking and secretion; Lipid transport and metabolism; Nucleotide
transport and metabolism; Protein modification and turnover; RNA processing and
modification; Signal transduction mechanisms; Transcription; Translation.]

ratio of 8.40, 8.10 and 8.76, respectively. The SF cDNA library had a sequence per cluster
ratio of 6.86, and the BFi cDNA library produced an average of 6.34 sequences per
cluster. Combining all cDNA library sequences produced 655 contigs, 2279
singletons and an average of 9.45 sequences per contig. Each cluster was assigned a
putative function and placed in a functional class based on the sequence homology to
molecules identified by the BLAST results from the NCBI non-redundant protein, the
Gene Ontology, the conserved domain, rRNA and mitochondrial databases. Figure 23
shows an overall view of sequence abundance of functional classes that occur during the
processes of sugar-feeding, blood-feeding and after the digestion of the blood meal. The
clusters of those three cDNA libraries, with an E-value less than 10E-5 as determined by
KOG BLAST, were grouped according to the general functional class. Although this is a
summation of a large number of different clusters, the total number of sequences in each
functional class can highlight overall trends that are potentially important in the processes
of blood-feeding and digestion.
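
The summary figures quoted for the libraries can be re-derived from the raw counts in Table 7. The following is a minimal sketch rather than the pipeline used in this study. It rests on two assumptions that are not stated in the text: that 2,304 plaques were picked per library (11,520 divided by five, consistent with "approximately 2300"), and that the sequences-per-contig ratio is the number of high quality sequences that are not singletons divided by the number of contigs, a definition chosen because it reproduces the tabulated ratios.

```python
# Counts copied from Table 7:
# (sequences analyzed, high quality sequences, contigs, singletons)
TABLE7 = {
    "SF": (1822, 1646, 148, 631),
    "BF": (1970, 1845, 137, 694),
    "BFi": (1928, 1683, 156, 694),
    "PBMD": (1953, 1650, 125, 638),
    "PBMDi": (1928, 1647, 117, 622),
    "Combined": (9601, 8471, 655, 2279),
}

# Assumption: 11,520 plaques in total, split evenly over five libraries.
PLAQUES_PER_LIBRARY = 11520 // 5


def summarize(name):
    """Return recovery %, high quality % and sequences per contig for one row."""
    analyzed, high_quality, contigs, singletons = TABLE7[name]
    picked = PLAQUES_PER_LIBRARY * (5 if name == "Combined" else 1)
    return {
        "recovery_pct": 100 * analyzed / picked,
        "high_quality_pct": 100 * high_quality / picked,
        # Assumed definition: clustered (non-singleton) sequences per contig.
        "seqs_per_contig": (high_quality - singletons) / contigs,
    }


if __name__ == "__main__":
    for name in TABLE7:
        row = summarize(name)
        print(
            f"{name:9s} recovery={row['recovery_pct']:.1f}% "
            f"high quality={row['high_quality_pct']:.1f}% "
            f"sequences/contig={row['seqs_per_contig']:.2f}"
        )
```

Under these assumptions the per-library ratios agree with Table 7 to two decimal places, which suggests that singletons were excluded when the average cluster size was calculated.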

Following  is  a  more  detailed  description  of  the  most  abundant  transcripts 
identified  in  this  analysis: 

Proteases 

Proteases were among the most abundant transcripts captured in the random
sequencing of the midgut cDNA libraries and included trypsin-like serine proteases,
chymotrypsins, carboxypeptidases, and an astacin-like metalloprotease. Table 8 shows
the putative proteases identified in the midgut transcriptome. The Sanger Institute's
Lutzomyia longipalpis EST database was searched using BLAST to find the best matches
Table 8: Putative midgut-associated proteases; best matched results and corresponding
E-values from BLAST inquiries of a GenBank-derived non-redundant protein database
and Lutzomyia longipalpis EST database

Cluster  Best match to non-redundant protein database             NR E value  Best match to Lutzomyia EST database  Lutzomyia E value  GenBank
35       trypsin 4 [P. papatasi]                                  1.E-101     NSFM-61a01                            3.E-140            ABM26904
18       trypsin 1 [P. papatasi]                                  8.E-79      SFM-03g02                             3.E-132            ABM26905
83       trypsin 3 [P. papatasi]                                  6.E-94      NSFM-113g08                           1.E-125            EU124590
60       trypsin 2 [P. papatasi]                                  4.E-67      NSFM-48a06                            2.E-137            EU124582
291      trypsin-eta, putative [A. aegypti]                       7.E-55      NSFM-15d03                            6.E-157            EU124595
33       chymotrypsin [P. papatasi]                               1.E-96      NSFM-95b07                            5.E-143            EU124576
32       chymotrypsin [P. papatasi]                               8.E-97      NSFM-61f07                            1.E-137            EU124575
64       larval chymotrypsin-like protein precursor [A. aegypti]  1.E-79      SFM-01b03                             1.E-130            EU124583
87       chymotrypsin [P. papatasi]                               3.E-79      NSFM-96h06                            2.E-148            EU124591
30       chymotrypsin [P. papatasi]                               2.E-94      NSFM-29b07                            5.E-133            EU124573
31       chymotrypsin [P. papatasi]                               5.E-96      NSFM-129f09                           1.E-141            EU124574
58/59    ENSANGP00000019623 [A. gambiae]                          3.E-57      NSFM-121h10                           6.E-131            EU124581
104      carboxypeptidase [A. aegypti]                            1.E-126     SFM-05c11                             1.E-221            EU124592
107      carboxypeptidase [A. aegypti]                            1.E-114     NSFM-146a05                           1.E-226            EU124593
91       similar to CG8560-PA [T. castaneum]                      2.E-82      NSFM-32d09                            8.E-199            EU124594


Table 9: Putative midgut-associated proteases; putative function and sequence
distribution contributed from each cDNA library

                                                        Number of sequences
Cluster  Clone          Putative function             SF    BF    BFi   PBMD  PBMDi  Total
35       LJGFiM23_B07   Trypsin                       3     55    34    0     0      92
18       LJGUL-P03G08   Trypsin                       136   6     15    109   168    434
83       LJGFM-P03_E11  Trypsin                       8     7     3     7     1      26
60       LJGU-I-5_D05   Trypsin                       8     4     8     10    8      38
291      LJGFiM26_A01   Serine protease               2     0     1     0     2      5
33       LJGFM-P04_B01  Chymotrypsin                  3     51    22    1     0      77
32       LJGFIL10_C10   Chymotrypsin                  0     2     5     0     0      7
64       LJGFM-P03C01   Chymotrypsin                  0     17    17    0     1      35
87       LJGF-I-8E03    Chymotrypsin                  1     14    8     0     2      25
30       LJGDIL5_B09    Chymotrypsin                  12    1     3     1     4      21
31       LJGFM-P01_C04  Chymotrypsin                  3     4     2     0     0      9
58/59    LJGUL-P01_B07  Astacin-like metalloprotease  28    7     1     1     4      41
104      LJGFLP01F01    Carboxypeptidase              0     14    3     0     0      17
107      LJGFL_P03_G11  Carboxypeptidase              6     5     7     0     0      18
91       LJGFiM22_C05   Carboxypeptidase              1     8     8     1     1      19

and results are shown with the corresponding E-value. The proteases described here are
most similar to those described in the sand fly Phlebotomus papatasi and the mosquitoes
Aedes aegypti or Anopheles gambiae, with the exception that cluster 91 encodes a putative
carboxypeptidase that shares homology with a molecule from the beetle Tribolium
castaneum. Table 9 shows the transcript producing a full length, high quality sequence
for each cluster and the putative function of the identified transcripts. The number of
sequences that each cluster contributed to each of the cDNA libraries is also shown and
from this, it can be seen that proteases are more abundant, as expected, in the blood-fed
(BF) and blood-fed Leishmania-infected (BFi) libraries. An interesting observation is
that cluster 18, which encodes a putative trypsin, is more abundant in the SF, PBMD and
PBMDi cDNA libraries, indicating that this putative trypsin may have a role other than
blood meal digestion or is produced and stored prior to the ingestion of a blood meal.
Table 10 describes the predicted localization, molecular weight and isoelectric point of
these proteases. All of the identified proteases possess a potential signal peptide, and the
molecular weight and isoelectric point given are those of the predicted mature protein.
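
Because the five libraries yielded slightly different numbers of high quality sequences, raw counts such as those in Table 9 are easier to compare after scaling by library size. The sketch below is an illustration only and is not the statistical method used in this study. It assumes that the number of high quality sequences per library (Table 7) is an appropriate denominator, and the function names `per_thousand` and `peak_library` are hypothetical.

```python
# High quality sequences per library, copied from Table 7.
HIGH_QUALITY = {"SF": 1646, "BF": 1845, "BFi": 1683, "PBMD": 1650, "PBMDi": 1647}

# Sequence counts for two trypsin clusters, copied from Table 9.
COUNTS = {
    "cluster18": {"SF": 136, "BF": 6, "BFi": 15, "PBMD": 109, "PBMDi": 168},
    "cluster35": {"SF": 3, "BF": 55, "BFi": 34, "PBMD": 0, "PBMDi": 0},
}


def per_thousand(counts):
    """Scale raw counts to sequences per 1000 high quality sequences."""
    return {lib: 1000 * n / HIGH_QUALITY[lib] for lib, n in counts.items()}


def peak_library(counts):
    """Return the library in which the cluster is proportionally most abundant."""
    normalized = per_thousand(counts)
    return max(normalized, key=normalized.get)


if __name__ == "__main__":
    for cluster, counts in COUNTS.items():
        scaled = ", ".join(f"{lib}={v:.1f}" for lib, v in per_thousand(counts).items())
        print(f"{cluster}: {scaled} (peak: {peak_library(counts)})")
```

With this scaling, cluster 18 remains most abundant outside the blood-fed libraries and cluster 35 peaks in the blood-fed library, in agreement with the raw counts. Small counts should still be interpreted cautiously, since EST abundance is only an indirect measure of expression.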

Trypsin 

Four trypsin-like transcripts were identified in the transcriptome with high
homology to the described P. papatasi midgut trypsins [3, 6]. Clusters 18, 35, 60 and 83
are similar to P. papatasi Pptryp1, Pptryp4, Pptryp2 and Pptryp3, respectively.
Recently, two transcripts from Lu. longipalpis midgut EST sequencing were partially
characterized and named Lltryp1, which corresponds with Cluster 35 identified in our
cDNA libraries, and Lltryp2, which corresponds with Cluster 18 [3]. Lltryp2 was found
Table 10: Putative midgut-associated proteases; localization, molecular weight and
isoelectric point of putative midgut proteins

Cluster  Putative function             Gene name    Localization  Molecular weight (kDa)  Isoelectric point
35       Trypsin                       Lltryp1      Secreted      26.2                    6.32
18       Trypsin                       Lltryp2      Secreted      26.0                    4.95
83       Trypsin                       LuloTryp3    Secreted      26.0                    5.67
60       Trypsin                       LuloTryp4    Secreted      26.1                    5.52
291      Serine protease               LuloSerPro   Secreted      29.0                    8.26
33       Chymotrypsin                  LuloChym1A   Secreted      26.8                    6.55
32       Chymotrypsin                  LuloChym1B   Secreted      26.6                    6.40
64       Chymotrypsin                  LuloChym2    Secreted      25.8                    6.74
87       Chymotrypsin                  LuloChym3    Secreted      27.6                    4.77
30       Chymotrypsin                  LuloChym4    Secreted      26.9                    5.86
31       Chymotrypsin                  LuloChym5    Secreted      26.9                    6.19
58/59    Astacin-like metalloprotease  LuloAstacin  Secreted      28.0                    5.04
104      Carboxypeptidase              LuloCpepA1   Secreted      45.8                    5.36
107      Carboxypeptidase              LuloCpepA2   Secreted      46.0                    5.41
91       Carboxypeptidase              LuloCpepB    Secreted      45.9                    4.73

in highest abundance, 434 sequences, with a unique sequence distribution among the
five cDNA libraries in that most sequences were contributed by the sugar-fed and
post-blood meal digestion groups. Sequence abundance of the other trypsins varied; listed in
order of decreasing abundance are Lltryp1, LuloTryp4 and LuloTryp3. LuloTryp3 and LuloTryp4
had relatively homogeneous sequence distribution among the cDNA libraries, although
LuloTryp3 was underrepresented in the PBMDi cDNA library with only one sequence
identified. The distribution of Lltryp1 sequences between the cDNA libraries correlates
with published reverse transcriptase-PCR results showing the expression of Lltryp1
during the presence of a blood meal in the female sand fly midgut [3]. Further
information about the putative trypsin molecules can be found in Table 10, showing the
range of molecular weight from 26.0 to 26.2 kDa. The isoelectric points (pI) of these
putative trypsins vary, with Lltryp1 having a higher pI of 6.32, Lltryp2 a lower pI of
4.95, and LuloTryp3 and LuloTryp4 having similar pIs of 5.67 and 5.52, respectively.
Phylogenetic analysis of amino acid sequences from Dipteran trypsin molecules and a
trypsin from Blattella germanica resulted in two major clades, one containing the An.
gambiae trypsin molecules (Group I) and another containing the remaining sequences.
Within the other major clade the sand fly trypsins from Lu. longipalpis and P. papatasi
form two subclades (Group II) (Figure 24A). As previously published [3], Pptryp1 and
Pptryp2 form a clade apart from the clade containing Pptryp3 and Pptryp4. The putative
trypsin molecules identified in Lu. longipalpis midgut share a high homology with the P.
papatasi molecules, being grouped into the same clades. Multiple sequence alignment of
the trypsin molecules of Lu. longipalpis depicts the potential secretory signal peptide, the
H/D/S catalytic site residues and substrate specifying residues (Figure 24B).

Figure 24. Sequence analysis of trypsin-like serine proteases.

(A) Phylogenetic analysis of amino acid sequences from Anopheles gambiae
(Antryp), Culicoides sonorensis (Cs), Blattella germanica (Bg), Lutzomyia longipalpis
(Lulo and Ll), Phlebotomus papatasi (Pp), Aedes aegypti (Aa), Drosophila melanogaster
(Dm) and Culex pipiens quinquefasciatus (Cp). Node support is indicated by bootstrap
values and accession numbers are given in parentheses. (B) Multiple sequence alignment of
Lutzomyia longipalpis putative trypsin molecules. Predicted secretion signal peptides are
underlined, catalytic residues marked by (*) and residues determining substrate
specificity marked by (#).

A novel midgut-associated serine protease, LuloSerPro, was identified in the
sequencing and annotation of these midgut cDNA libraries. LuloSerPro is predicted to
be secreted and have a mature molecular weight of 29.0 kDa, slightly larger than the
other trypsin-like serine proteases in the midgut, and has an unusually high predicted pI
of 8.26 (Table 10). This molecule, while found in low abundance, was present in the
sugar-fed, blood-fed Leishmania-infected, and post-blood meal digestion Leishmania-infected
cDNA libraries (Table 9). Phylogenetic analysis and multiple sequence
alignments of the midgut trypsin molecules and LuloSerPro show that while this
molecule is very similar to other trypsin molecules and retains the catalytic residues, it
is a distinctly different serine protease (Figure 24). Additionally, there is a difference in
the residues that determine the substrate specificity (Lys to Val) between the other
midgut trypsins and LuloSerPro (Figure 24B).

Chymotrypsin 

Chymotrypsin  is  another  serine  protease  found  in  abundance  in  the  midgut  of  this 
hematophage.  This  study  identified  five  clusters  with  homology  to  chymotrypsin 
molecules  described  in  P.  papatasi  and  one  cluster  with  homology  to  a  putative  larval 
chymotrypsin found in Ae. aegypti (Tables 8-10). Clusters 33, 32, 64, 87, 30 and 31 were
named LuloChym1A, LuloChym1B, LuloChym2, LuloChym3, LuloChym4 and
LuloChym5, respectively. LuloChym4 was found in higher abundance in the sugar-fed
cDNA library and LuloChym5 sequences were found in relatively equal numbers between
blood-fed and sugar-fed cDNA libraries. In contrast, the other chymotrypsin molecules
appear in highest abundance in the blood-fed and blood-fed Leishmania-infected cDNA
libraries (Table 9). According to sequence numbers between the cDNA libraries it
appears that chymotrypsin transcription is quiescent after the blood meal has been
digested and excreted. The Lu. longipalpis chymotrypsin sequences have a predicted
molecular weight of secreted protein ranging from 25.8 to 27.6 kDa (Table 10).

Phylogenetic analysis of chymotrypsin amino acid sequences shows that there is
conservation in sequence homology between Lu. longipalpis chymotrypsin and P.
papatasi chymotrypsin molecules (Figure 25A). LuloChym1A, LuloChym1B,
LuloChym4 and LuloChym5 form a subclade within a clade containing only sand fly
chymotrypsin molecules. The short phylogenetic distance between LuloChym1A and
LuloChym1B and the 95% amino acid identity they share suggest that these transcript
sequences may represent polymorphisms. Further comparisons between the amino acid
sequences of the midgut-associated chymotrypsin molecules show that the cysteine and
catalytic residues H/D/S are conserved (Figure 25B).
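
A pairwise identity figure such as the 95% quoted for LuloChym1A and LuloChym1B can be obtained from two rows of an alignment. The sketch below shows one common convention, assumed here rather than taken from this study: identical residues are counted over the aligned columns in which neither sequence has a gap. The function name and the two short sequences are hypothetical and serve only to exercise the calculation; they are not the sand fly sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: one substitution (H/Q) and one gapped column.
    toy_a = "IVGGHDAS-IEDAPYQ"
    toy_b = "IVGGQDASKIEDAPYQ"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so the denominator should be reported alongside any identity value.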

Carboxypeptidases 

The  three  longest  transcripts  encoding  putative  proteases  identified  in  the  analysis 
are  similar  to  zinc  metallocarboxypeptidases  found  in  other  insects  as  well  as  having 
significant  similarity  to  ESTs  from  the  Sanger  Institute  database  (Table  8).  These 
transcripts, from clusters 104, 107 and 91, were named LuloCpepA1, LuloCpepA2 and
LuloCpepB; they have molecular weights of 45.8, 46.0 and 45.9 kDa and pIs of 5.36,
5.41 and 4.73, respectively (Table 10). Although LuloCpepA2 appears to be an
incomplete transcript with a 5' truncation, based on homology and predicted signal
peptide sequences, a putative mature protein can be used in further characterization and
Figure 25. Chymotrypsin sequence analysis.

(A) Phylogenetic analysis of chymotrypsin sequences from Phlebotomus papatasi
(Pp), Lutzomyia longipalpis (Lulo), Anopheles gambiae (Ag), Aedes aegypti (Aa), and
Culicoides sonorensis (Cs). Accession numbers are shown in parentheses and node
support is indicated by the bootstrap values. (B) Sequence comparison of midgut putative
chymotrypsin molecules. The probable signal peptide is underlined, the catalytic
residues indicated by (*) and conserved cysteine residues marked with Q).

comparison. Most of the sequences grouped to produce the carboxypeptidase clusters
were captured from the blood-fed library, suggesting that these molecules are likely
induced by the ingestion or presence of blood in the midgut of the sand fly (Table 9).

The  classification  of  these  molecules  as  members  of  the  A  or  B  class  of 
metallocarboxypeptidases  was  determined  by  the  output  from  phylogenetic  analysis  of 
the  amino  acid  sequences  (Figure  26A).  The  phylogenetic  tree  produced  by  this  analysis 
shows  distinct  clades  containing  insect  sequences  nearly  all  annotated  as  either 
carboxypeptidase  A  or  carboxypeptidase  B  molecules.  The  high  node  support  values  of 
the  sand  fly  carboxypeptidases  in  the  phylogenetic  tree  imply  conservation  of  these 
molecules  when  comparing  the  Old  World  sand  fly  P.  papatasi  and  the  New  World  sand 
fly  Lu.  longipalpis.  Similarity  between  the  two  sand  flies,  with  regards  to  the 
carboxypeptidase  molecules,  can  be  seen  in  amino  acid  sequence  alignments  depicting 
the  high  level  of  identity  and  retention  of  the  catalytic  residues  necessary  for 
metallocarboxypeptidase activity (Figure 26B, 26C). Furthermore, the amino acid
sequence alignment depicts the sequence differences that separate LuloCpepA1 from
LuloCpepA2 (Figure 26B).

Astacin 

A  putative  zinc  metalloprotease  was  identified  as  a  likely  astacin-like  molecule 
based  on  results  from  a  search  of  the  conserved  domains  database.  This  astacin  molecule 
was  derived  from  clusters  58  and  59,  both  encoding  the  same  putative  protein,  but 
separated  due  to  differing  lengths  of  5’-  and  3’-  UTRs  by  the  bioinformatics  software. 
The  astacin-like  metalloprotease  was  named  LuloAstacin  and  is  predicted  to  have  a 
Figure 26. Analysis of putative carboxypeptidase molecules.

(A) Phylogenetic analysis of carboxypeptidases from Lutzomyia longipalpis
(Lulo), Phlebotomus papatasi (Pp), Ochlerotatus triseriatus (Ot), Aedes aegypti (Ae),
Anopheles gambiae (Ag), Drosophila melanogaster (Dm), Tribolium castaneum (Tc),
Tenebrio molitor (Tm) and Culicoides sonorensis (Cs). GenBank accession numbers are
in parentheses and node support is indicated by bootstrap values. (B) Sequence
alignment of putative carboxypeptidase A molecules identified from the midgut of
Lutzomyia longipalpis (Lulo) and Phlebotomus papatasi (Pp). Predicted catalytic
residues are marked with (*). (C) Sequence alignment of putative carboxypeptidase B
molecules identified from the midgut of Lutzomyia longipalpis (Lulo) and Phlebotomus
papatasi (Pp). Predicted catalytic residues are marked with (*).

Figure 27. Astacin-like metalloprotease sequence comparison and analysis.

(A) Phylogenetic analysis of amino acid sequences from Lutzomyia longipalpis
(Lulo), Phlebotomus papatasi (Pp), Mus musculus (Mm), Homo sapiens (Hs), Glossina
morsitans morsitans (Gm), Drosophila melanogaster (Dm), Aedes aegypti (Aea),
Caenorhabditis elegans (Ce), Anopheles gambiae (Ag), Astacus astacus (Asa) and
Culicoides sonorensis (Cs). Node support is indicated by the bootstrap values. (B)
Multiple sequence alignment of Dipteran astacin-like molecules. Predicted signal peptide
sequence is underlined and the residues likely necessary for catalytic activity are marked
with (*).

molecular weight of 28 kDa once secreted and pI of 5.36 (Table 10). LuloAstacin was
most abundant in the sugar-fed cDNA library in contrast to PpAstacin, an astacin-like
molecule identified in P. papatasi midgut, which was most abundant in the blood-fed
cDNA library (Table 9). Phylogenetic analysis of other putative astacin amino acid
sequences illustrates that one clade is an assemblage of the Dipteran sequences.
LuloAstacin branches out of the subclade containing PpAstacin and away from the other
Dipteran sequences (Figure 27A). Further differences in amino acid sequence can be
visualized in the multiple sequence alignment of Dipteran astacins and, while LuloAstacin
diverges from the other astacin molecules, the residues responsible for zinc-binding and
activity are conserved (Figure 27B).

Peritrophin-like  proteins 

A number of molecules were identified as containing chitin-binding domains
based on results from the conserved domains database (Tables 11-13). Three of the
transcripts resembled previously identified peritrophin molecules based on sequence
homology with peritrophin-A domains. The most abundant of these putative peritrophin
transcripts was named LuloPer1 (Cluster 77/78); it was overrepresented in the
blood-fed Leishmania-infected cDNA library and likely encodes a secreted protein of 27.8 kDa
(Tables 12 and 13). LuloPer1 consists of four chitin-binding domains (Figure 28A),
in contrast to the other two peritrophin molecules, LuloPer2 and LuloPer3, which are
molecules of a single chitin-binding domain (Figure 28). LuloPer2 and LuloPer3
sequences originated in higher numbers from blood-fed midgut cDNA libraries and were


Table 11: Putative midgut-associated peritrophin proteins; best matched results and
corresponding E-values from BLAST inquiries of a GenBank-derived non-redundant
protein database and Lutzomyia longipalpis EST database

Cluster  Best match to non-redundant protein database  NR E value  Best match to Lutzomyia EST database  Lutzomyia E value  GenBank
77/78    similar to CG7248-PA [T. castaneum]           1.E-17      NSFM-114b12                           5.E-155            EU124588
114      similar to CG4778-PA [T. castaneum]           1.E-12      NSFM-67f02                            4.E-06             EU124602
171      ENSANGP00000013237 [A. gambiae]               4.E-10      NSFM-67f02                            6.E-07             EU124607
274      conserved hypothetical protein [A. aegypti]   6.E-67      NSFM-35c11                            3.E-28             EU124616

Table 12: Putative midgut-associated peritrophin proteins; putative function and
sequence distribution contributed from each cDNA library

                                                 Number of sequences
Cluster  Clone          Putative function  SF    BF    BFi   PBMD  PBMDi  Total
77/78    LJGFiM27_H09   Peritrophin        0     6     22    0     0      28
114      LJGUM-P04G10   Peritrophin        1     7     9     0     0      17
171      LJGFL_P03_H05  Peritrophin        1     4     4     1     0      10
274      LJGFiM24_D03   Chitin binding     1     0     4     1     0      6


Table 13: Putative midgut-associated peritrophin proteins; localization, molecular
weight and isoelectric point of putative midgut proteins

Cluster  Putative function  Gene name  Localization  Molecular weight (kDa)  Isoelectric point
77/78    Peritrophin        LuloPer1   Secreted      27.8                    5.00
114      Peritrophin        LuloPer2   Secreted      9.2                     4.38
171      Peritrophin        LuloPer3   Secreted      7.5                     3.80
274      Chitin binding     LuloChiBi  Secreted      20.9                    6.65



Figure 28. Characterization of peritrophin sequences.

(A) Diagrammatic representation of Lutzomyia longipalpis peritrophin-like
molecules showing the predicted signal peptide and chitin-binding domains. (B)
Phylogenetic analysis of predicted chitin-binding domains of peritrophin molecules from
Aedes aegypti (Ae), Anopheles gambiae (Ag), Ctenocephalides felis (Cf), Lucilia cuprina
(Luc), Phlebotomus papatasi (Pp) and Lutzomyia longipalpis (Lulo). Accession numbers are
given in parentheses and bootstrap values indicate node support.

[Figure 28 image not reproduced. Sequences shown in panel B, by individual domain (D):
Ae Aper50 D1-D5 (AAL05409); Ae IMUC1 D1-D3 (AF308863); Ag Aper1 D1-D2 (AF030431);
Cf PL1 D1-D2 (AF373879); Cf PL2 D1-D2 (AF373880); Cf PL3 (AF373881);
Luc Per D1-D5 (L25106); Pp Per1 D1-D4 (EU031912); Pp Per2 (EU047543);
Pp Per3 D1-D2 (EU045354); LuloPer1 D1-D4 (EU124588); LuloPer2 (EU124602);
LuloPer3 (EU124607).]
in relatively equal numbers between the infected and uninfected sand flies. These small
putative peritrophins are predicted to have mature molecular weights of 9.2 and 7.5 kDa
and isoelectric points of 4.38 and 3.80 for LuloPer2 and LuloPer3, respectively (Table 13).
LuloPer1 is likely to have a role in cross-linking the chitin fibrils that form the
peritrophic matrix around the ingested blood bolus. LuloPer2 and LuloPer3 may have
roles in capping the ends of chitin fibrils or sequestering free chitinous molecules within
the midgut lumen. However, the two sequences share only 39% identity and 44%
similarity, conserving primarily the cysteine residues, suggesting they may have very
different ligand specificities or roles in peritrophic matrix formation and/or chitin
management within the midgut. Phylogenetic analysis of the individual chitin-binding
domains from several other insect peritrophin and mucin molecules demonstrates
conservation of the LuloPer1 domain arrangement when compared with P. papatasi
PpPer1, suggesting that if the domains arose by gene duplication, those events
occurred prior to speciation (Figure 28). Additionally, the single domains of the small
putative peritrophins LuloPer2 and LuloPer3 form a clade containing another
chitin-binding domain from a small peritrophin of P. papatasi (Figure 28).
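The identity and similarity percentages quoted above are computed from a pairwise alignment. The following is a minimal sketch of that calculation, not the procedure used in this study: the similarity groups, the choice of denominator (all alignment columns that are not gaps in both sequences) and the toy sequences are assumptions made for illustration. Alignment programs normally derive similarity from a scoring matrix and may use a different denominator.

```python
# Hypothetical groups of amino acids counted as "similar"; real tools
# usually take similarity from a scoring matrix such as BLOSUM62.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) for two gapped, aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == "-" or b == "-":
            continue  # a gap counts toward the length but never matches
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * identical / columns, 100.0 * similar / columns


# Toy aligned fragments (not sequences from this study).
identity, similarity = percent_identity_similarity("CKDG-LYC", "CRDGSIFC")
print(f"identity={identity:.1f}%  similarity={similarity:.1f}%")
```

Because the cysteine columns match in the toy example, they contribute to both percentages, which mirrors the observation that the conserved residues in LuloPer2 and LuloPer3 are primarily cysteines.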

In addition to the putative peritrophin molecules, a transcript with homology to a
predicted chitin-binding domain was identified from the clustering of 6 sequences
collected primarily from the blood-fed Leishmania-infected cDNA library. This domain
has homology to a much larger chitin-binding domain than those found in the putative
peritrophin molecules. The identified transcript, LuloChiBi, has one of these
chitin-binding domains and is predicted to have a mature molecular weight of 20.9 kDa
(Table 13).

Microvillar proteins

Among the most abundant sequences identified in the cDNA libraries were
transcripts encoding putative microvillar-associated proteins with homology to insect
allergens identified in Periplaneta americana and Blattella germanica (Table 14). BLAST
analysis also found high homology to molecules in the mosquito Aedes
aegypti. In order of decreasing overall sequence abundance, clusters 27, 29, 48, 66 and
36 were named LuloMVP1, LuloMVP2, LuloMVP3, LuloMVP4 and LuloMVP5,
respectively (Table 15). In general, the microvillar proteins were most abundant in the
blood-fed cDNA libraries, although LuloMVP3 (cluster 48) sequences were
underrepresented in the blood-fed cDNA libraries and were identified in relatively equal
numbers in the sugar-fed and post-blood meal digestion cDNA libraries. LuloMVP1,
LuloMVP2 and LuloMVP5 have nearly equal mature molecular weights of approximately
21 kDa, based on cleavage of the predicted signal peptide present in all of the
microvillar proteins, while LuloMVP3 and LuloMVP4 are slightly larger, at around 23 kDa.
A notable difference in isoelectric point was observed among the microvillar proteins:
the predicted value for LuloMVP3 is 8.84, whereas the isoelectric points of the other
microvillar molecules range from 4.46 to 5.12 (Table 16).
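The mature molecular weights discussed above are obtained by removing the predicted signal peptide and summing the residue masses of what remains. The sketch below illustrates that arithmetic under stated assumptions: the function name, the toy precursor and its 15-residue signal peptide are hypothetical, the signal peptide length would in practice come from a prediction tool, and disulfide bonds and other modifications are ignored.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free termini


def mature_mass_kda(precursor, signal_peptide_length):
    """Average mass (kDa) of a protein after removing its signal peptide."""
    mature = precursor[signal_peptide_length:].upper()
    if not mature:
        raise ValueError("no residues remain after signal peptide removal")
    mass_da = sum(RESIDUE_MASS[aa] for aa in mature) + WATER_MASS
    return mass_da / 1000.0


# Hypothetical precursor: a 15-residue signal peptide followed by a short
# mature region (not a sequence from this study).
toy_precursor = "MKLLVALAVLAFASA" + "GDKCE"
print(f"{mature_mass_kda(toy_precursor, 15):.3f} kDa")
```

The isoelectric point depends on the pKa set chosen for the ionizable groups, so it is not reproduced here.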

The  Lu.  longipalpis  microvillar  proteins  share  respective  homology  with  similar 
molecules  identified  in  the  midgut  of  P.  papatasi,  as  demonstrated  by  amino  acid 
phylogenetic  analysis  (Figure  29).  The  sand  fly  microvillar  proteins  are  separated  from 
the  clade  containing  cockroaches.  Additionally,  LuloMVP2  and  LuloMVP5  are  in  a 
subclade  with  the  microvillar  proteins  of  Ae.  aegypti  and  An.  gambiae,  while  the  other 
molecules  pair  with  the  P.  papatasi  microvillar  proteins  (Figure  29A).  Sequence 


Table 14: Putative midgut-associated microvillar proteins; best matched results and
corresponding E-values from BLAST inquiries of a GenBank-derived non-redundant
protein database and Lutzomyia longipalpis EST database

| Cluster | Best match to non-redundant protein database | NR E value | Best match to Lutzomyia EST database | Lutzomyia E value | GenBank |
|---|---|---|---|---|---|
| 27 | conserved hypothetical protein [A. aegypti] | 2E-46 | NSFM-126e12 | 2E-100 | EU124571 |
| 29 | conserved hypothetical protein [A. aegypti] | 2E-36 | NSFM-19a10 | 6E-99 | EU124572 |
| 48 | Cr-PII allergen [P. americana] | 9E-22 | NSFM-68e08 | 1E-101 | EU124579 |
| 66 | conserved hypothetical protein [A. aegypti] | 4E-27 | NSFM-47h07 | 1E-117 | EU124584 |
| 36 | putative protein G12 [A. aegypti] | 6E-41 | NSFM-154e02 | 3E-106 | EU124577 |

Table 15: Putative midgut-associated microvillar proteins; putative function and
sequence distribution contributed from each cDNA library
(SF, BF, BFi, PBMD, PBMDi and Total: number of sequences)

| Cluster | Clone | Putative function | SF | BF | BFi | PBMD | PBMDi | Total |
|---|---|---|---|---|---|---|---|---|
| 27 | LJGFiL9_B01 | Microvillar protein | 5 | 109 | 55 | 0 | 0 | 169 |
| 29 | LJGFM9F05 | Microvillar protein | 3 | 87 | 40 | 0 | 0 | 130 |
| 48 | LJGFSP01C07 | Microvillar protein | 15 | 6 | 5 | 18 | 18 | 62 |
| 66 | LJGFiM27_D08 | Microvillar protein | 1 | 24 | 7 | 0 | 0 | 32 |
| 36 | LJGFS_P04_B01 | Microvillar protein | 1 | 60 | 27 | 0 | 0 | 88 |

Table 16: Putative midgut-associated microvillar proteins; localization, molecular
weight and isoelectric point of putative midgut proteins

| Cluster | Putative function | Gene name | Localization | Molecular weight (kDa) | Isoelectric point |
|---|---|---|---|---|---|
| 27 | Microvillar protein | LuloMVP1 | Secreted | 21.6 | 5.09 |
| 29 | Microvillar protein | LuloMVP2 | Secreted | 21.5 | 5.12 |
| 48 | Microvillar protein | LuloMVP3 | Secreted | 23.1 | 8.84 |
| 66 | Microvillar protein | LuloMVP4 | Secreted | 23.6 | 4.46 |
| 36 | Microvillar protein | LuloMVP5 | Secreted | 21.7 | 4.67 |
Figure 29. Sequence analysis of microvillar proteins.

(A) Phylogenetic analysis of amino acid sequences from Blattella germanica
(Bg), Periplaneta americana (Pa), Tenebrio molitor (Tm), Aedes aegypti (Aa), Anopheles
gambiae (Ag), Phlebotomus papatasi (Pp) and Lutzomyia longipalpis (Lulo). Bootstrap
values indicate node support and accession numbers are given in parentheses. (B)
Multiple sequence alignment of the microvillar proteins of Lutzomyia longipalpis. The
predicted signal secretion peptide is underlined.

[Figure 29 image not reproduced. Sequences labeled in panel A: Pa (AAC34312, AAC34737,
AAC34736); Bg (AAD13530, AAD13532, AAD13531); Tm (AAP92419); Aa (AAK72505, AAK72506);
Ag (CAA80505); LuloMVP1 (EU124571); LuloMVP2 (EU124572); LuloMVP3 (EU124584);
LuloMVP4 (EU124577); LuloMVP5 (EU124579); PpMVP1 (EU031911); PpMVP2 (EU047549);
PpMVP3 (EU047550); PpMVP4 (EU047551). Panel B aligns LuloMVP1 through LuloMVP5.]

alignment of the Lu. longipalpis microvillar proteins shows little sequence homology,
suggesting that the classification of microvillar proteins is rather broad and perhaps that
these molecules have different functions altogether.

Oxidative  stress  molecules 

The sand fly, being an obligate blood-feeding insect, must cope with the
physiological challenges posed by the digestion of blood, which include the generation
of reactive oxygen species (ROS) released by free heme and the metabolic radicals produced
in abundance during the digestion of the blood meal [7]. Five molecules with putative
roles as antioxidants were identified in the midgut cDNA libraries, including glutathione
S-transferase (GST), catalase, copper-zinc superoxide dismutase (SOD) and
peroxiredoxin (PRX) (Table 17). In addition to the protection these molecules may
provide by regulating ROS generated during blood meal digestion, there is evidence that
antioxidants interact with and can affect the outcome of infection by bacterial and
parasitic agents [8]. Two transcripts were identified with homology to GST molecules
of the Class Sigma and Class Delta/Epsilon subfamilies and were named LuloGST1 and
LuloGST2, respectively. Phylogenetic analysis of the putative GST molecules supports
their separation and classification into the Sigma and Delta/Epsilon subfamily classes.
Additionally, LuloGST1 is grouped in a subclade with other Dipteran GST molecules,
while LuloGST2 diverges from the Dipteran Delta/Epsilon GST molecules (Figure 30).
The LuloGST1 cluster was generated from sequences from each of the cDNA libraries
made and analyzed, while LuloGST2 consists of one sequence from the sugar-fed library
Table 17: Putative midgut-associated oxidative stress molecules; best matched results
and corresponding E-values from BLAST inquiries of a GenBank-derived non-redundant
protein database and Lutzomyia longipalpis EST database

| Cluster | Best match to non-redundant protein database | NR E value | Best match to Lutzomyia EST database | Lutzomyia E value | GenBank |
|---|---|---|---|---|---|
| 221 | glutathione S-transferase [A. aegypti] | 8E-89 | NSFM-105e10 | 9E-105 | EU124611 |
| 419 | glutathione S-transferase [A. aegypti] | 2E-50 | NSFM-95g05 | 2E-116 | EU124621 |
| 76 | GA13179-PA [D. pseudoobscura] | 1E-43 | NSFM-144g07 | 6E-103 | EU124587 |
| 79 | ferritin heavy chain-like [G. morsitans] | 2E-66 | NSFM-146d09 | 7E-111 | EU124589 |
| 781 | catalase [A. aegypti] | 1E-119 | NSFM-142e04 | 3E-286 | EU124624 |
| 1709 | ENSANGP00000015824 [A. gambiae] | 1E-50 | NSFM-39d09 | 1E-91 | EU124625 |
| 2557 | peroxiredoxins, prx-1, prx-2, prx-3 [A. aegypti] | 1E-105 | NSFM-34h03 | 3E-126 | EU124629 |

Table 18: Putative midgut-associated oxidative stress molecules; putative function and
sequence distribution contributed from each cDNA library
(SF, BF, BFi, PBMD, PBMDi and Total: number of sequences)

| Cluster | Clone | Putative function | SF | BF | BFi | PBMD | PBMDi | Total |
|---|---|---|---|---|---|---|---|---|
| 221 | LJGFM9A02 | Glutathione S-transferase | 1 | 2 | 1 | 2 | 1 | 7 |
| 419 | LJGFIL9G06 | Glutathione S-transferase | 1 | 0 | 2 | 0 | 0 | 3 |
| 76 | LJGFL_P04_E07 | Ferritin light-chain | 5 | 7 | 8 | 3 | 5 | 28 |
| 79 | LJGFiM24_C11 | Ferritin heavy-chain | 3 | 6 | 7 | 4 | 4 | 24 |
| 781 | LJGFiM21_C02 | Catalase | 0 | 0 | 1 | 0 | 1 | 2 |
| 1709 | LJGD-L10_H10 | Cu/Zn superoxide dismutase | 0 | 0 | 0 | 1 | 0 | 1 |
| 2557 | LJGDIM21_G01 | Peroxiredoxin | 0 | 0 | 0 | 0 | 1 | 1 |

Table 19: Putative midgut-associated oxidative stress molecules; localization, molecular
weight and isoelectric point of putative midgut proteins

| Cluster | Putative function | Gene name | Localization | Molecular weight (kDa) | Isoelectric point |
|---|---|---|---|---|---|
| 221 | Glutathione S-transferase | LuloGST1 | Intracellular | 23.3 | 5.00 |
| 419 | Glutathione S-transferase | LuloGST2 | Intracellular | 24.8 | 6.41 |
| 76 | Ferritin light-chain | LuloFLC | Secreted | 24.4 | 6.68 |
| 79 | Ferritin heavy-chain | LuloFHC | Secreted | 21.9 | 4.92 |
| 781 | Catalase | LuloCat | Intracellular | 57.7 | 8.11 |
| 1709 | Cu/Zn superoxide dismutase | LuloSOD | Secreted | 19.8 | 5.63 |
| 2557 | Peroxiredoxin | LuloPRX | Secreted | 25.0 | 6.66 |
Figure 30. Phylogenetic analysis of glutathione S-transferase molecules.

Sequences analyzed from Lutzomyia longipalpis (Lulo), Phlebotomus papatasi
(Pp), Drosophila melanogaster (Dm), Aedes aegypti (Ae), Anopheles gambiae (Ag),
Musca domestica (Md), Bombyx mori (Bm), Tribolium castaneum (Tc) and Blattella
germanica (Bg). Accession numbers are given in parentheses and the clades are labeled
with the respective glutathione S-transferase class.

and two sequences from the blood-fed Leishmania-infected cDNA library. Additional
antioxidant molecules include a catalase (LuloCAT), a copper-zinc superoxide dismutase
(LuloSOD) and a peroxiredoxin (LuloPRX), of which LuloSOD and LuloPRX are both
predicted to be secreted based on the presence of a likely signal peptide sequence. ROS
and reactive nitrogen oxide species (RNOS) are important in host defenses against
microorganisms, and LuloCAT, LuloSOD and LuloPRX may serve to regulate these
defenses and prevent damage to the sand fly midgut by ROS and RNOS,
similar to the protective effect of peroxiredoxin in An. stephensi [9].

Upon the ingestion of a blood meal by a hematophagous insect, large amounts of
iron and heme are released during digestion. To combat the toxic effects of free iron and
the generation of damaging reactive oxygen species, ferritin is produced to sequester the
iron and hemoglobin that are liberated by the digestion of red blood cells. Ferritin
molecules are commonly associated with iron metabolism, and it is likely that the
molecules identified in this transcriptome engage in this metabolic function. However, given
the size of the blood meal relative to the sand fly, ferritin molecules
within the midgut likely play a large role in preventing the generation of oxygen radicals
by the Fenton reaction. Two transcripts from clusters 76 and 79 were identified with
homology to ferritin light-chain and ferritin heavy-chain molecules and were named
LuloFLC and LuloFHC, respectively (Tables 17 and 19). The expression of LuloFLC
and LuloFHC appears to be constitutive based on the number of sequences generated in
each cDNA library across the sugar-fed, blood-fed and post-blood meal
digestion conditions (Table 18).
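For reference, the Fenton reaction mentioned above is conventionally written as shown below. This is the standard textbook form, added here for illustration and not taken from the original text: ferrous iron reduces hydrogen peroxide to produce a hydroxyl radical, which is why sequestering free iron limits radical formation.

```latex
\[
  \mathrm{Fe^{2+}} + \mathrm{H_2O_2}
  \;\longrightarrow\;
  \mathrm{Fe^{3+}} + \mathrm{OH^{-}} + {}^{\bullet}\mathrm{OH}
\]
```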
Serine protease inhibitors

Two types of serine protease inhibitors were identified in the cDNA libraries: a
single sequence with homology to SERPIN and a cluster of 17 sequences with homology
to a Kazal-type serine protease inhibitor (Tables 20-22). SERPIN molecules within the
midgut of the sand fly may serve to counteract damaging proteases produced by
microorganisms; however, LuloSRPN lacks a predicted signal peptide sequence and thus
may serve an intracellular housekeeping function. LuloKZL, identified from cluster 112,
is a small molecule of 6.3 kDa and is predicted to be secreted. Comparison of LuloKZL
with Kazal-type serine protease inhibitors found in a transcriptome analysis of the midgut
of P. papatasi identified PpKZL1 as a highly conserved homolog. Kazal-type protease
inhibitors, such as rhodniin and infestin identified in Rhodnius prolixus and Triatoma
infestans, respectively, have been characterized as thrombin inhibitors; these
molecules would thereby prevent coagulation of ingested blood to facilitate successful
digestion of the blood meal [10, 11]. LuloKZL sequences are more abundant prior to and
during blood meal digestion, based on the number of sequences in the sugar-fed, blood-fed
and post-blood meal digestion cDNA libraries. Additionally, LuloKZL was not identified in
an EST analysis of whole Lu. longipalpis sand flies and is therefore more likely a
midgut-specific molecule found in abundance only in the alimentary tissue [5]. Thus, a
prudent hypothesis would be that LuloKZL serves a function similar to that of the thrombin
inhibitors, allowing the blood bolus to remain in a colloidal suspension within the gut to
facilitate peristalsis and digestion.
Table 20: Housekeeping and low-abundance transcripts from the midgut of Lutzomyia
longipalpis; best matched results and corresponding E-values from BLAST inquiries of a
GenBank-derived non-redundant protein database and Lu. longipalpis EST database

| Cluster | Best match to non-redundant protein database | NR E value | Best match to Lutzomyia EST database | E value | GenBank |
|---|---|---|---|---|---|
| 128 | GAPDH II [D. pseudoobscura] | 1E-168 | SFM-03d02 | 2E-161 | EU124605 |
| 195 | fructose-bisphosphate aldolase [A. aegypti] | 1E-138 | NSFM-99b02 | 5E-153 | EU124609 |
| 189 | sugar transporter [A. aegypti] | 0E+00 | NSFM-46e10 | 1E-255 | EU124608 |
| 200 | ENSANGP00000018531 [A. gambiae] | 0E+00 | SFM-05b09 | 4E-225 | EU124610 |
| 292 | cytochrome c oxidase subunit iv [A. aegypti] | 3E-58 | NSFM-43a12 | 6E-97 | EU124618 |
| 97 | ADP/ATP translocase [L. cuprina] | 1E-154 | NSFM-64b06 | 2E-161 | EU124598 |
| 69 | Vacuolar ATP synthase 16 kDa proteolipid subu [A. aegypti] | 2E-77 | NSFM-95b05 | 6E-70 | EU124586 |
| 67/192 | Actin 87E CG18290-PA, isoform A [D. melanogaster] | 0E+00 | NSFM-41f08 | 5E-202 | EU124585 |
| 112 | GA16408-PA [D. pseudoobscura] | 4E-10 | - | - | EU124601 |
| 2287 | serine protease inhibitor 4 [A. aegypti] | 3E-27 | NSFM-73e11 | 3E-185 | EU124627 |
| 358 | RAS, putative [A. aegypti] | 1E-90 | NSFM-155h05 | 2E-93 | EU124620 |
| 2556 | ENSANGP00000016718 [A. gambiae] | 1E-103 | NSFM-83c08 | 4E-99 | EU124628 |
| 500 | conserved hypothetical protein [A. aegypti] | 7E-08 | NSFM-154d08 | 3E-09 | EU124623 |
| 235 | peptidoglycan recognition protein LB [G. morsitans] | 8E-69 | NSFM-81b08 | 4E-109 | EU124614 |
| 1960 | defensin isoform B1 [A. aegypti] | 2E-12 | - | - | EU124626 |
| 269 | 40S ribosomal protein S7 ribosomal protein [C. pipiens] | 3E-89 | NSFM-15f05 | 1E-97 | EU124615 |
| 423 | ribosomal protein S20 [B. mori] | 7E-56 | NSFM-41g09 | 2E-59 | EU124622 |
| 226 | ribosomal protein S8 [A. albopictus] | 6E-93 | NSFM-01c05 | 9E-97 | EU124612 |
| 125 | LD16326p [D. melanogaster] | 1E-100 | NSFM-52a05 | 2E-84 | EU124604 |
| 304 | Ribosomal protein L32 CG7939-PC, isoform C [D. melanogaster] | 1E-67 | - | - | EU124619 |
| 101 | GA20389-PA [D. pseudoobscura] | 1E-153 | SFM-03h12 | 2E-133 | EU124599 |
| 108 | 60S acidic ribosomal protein P1 [S. frugiperda] | 3E-48 | NSFM-163b12 | 5E-27 | EU124600 |
| 119 | similar to Drosophila melanogaster CG2099 [D. yakuba] | 2E-54 | NSFM-100a07 | 6E-62 | EU124603 |
| 40 | similar to Neurospecific receptor kinase CG4007-PA [A. mellifera] | 8E-01 | NSFM-57e04 | 6E-140 | EU124578 |
| 54/55 | 14.5 kDa salivary protein [P. duboscqi] | 8E-41 | - | - | EU124580 |
| 88 | bS11M [A. aegypti] | 2E-03 | NSFM-149f10 | 1E-58 | EU124596 |
| 151 | CG14401-PA [D. melanogaster] | 3E-06 | NSFM-114e07 | 1E-11 | EU124606 |
| 230 | conserved hypothetical protein [A. aegypti] | 8E-22 | - | - | EU124613 |
| 90 | Hypothetical protein C30H6.11 [C. elegans] | 2E-10 | NSFM-55h01 | 7E-64 | EU124597 |
| 276 | CG32644-PB [D. melanogaster] | 2E-11 | NSFM-23d08 | 8E-27 | EU124617 |
Table 21: Housekeeping and low-abundance transcripts from the midgut of Lutzomyia
longipalpis; putative function and sequence distribution contributed from each cDNA
library
(SF, BF, BFi, PBMD, PBMDi and Total: number of sequences)

| Cluster | Clone | Putative function | SF | BF | BFi | PBMD | PBMDi | Total |
|---|---|---|---|---|---|---|---|---|
| 128 | LJGDIM22_F06 | Glyceraldehyde-3-phosphate dehydrogenase | 1 | 1 | 0 | 4 | 7 | 13 |
| 195 | LJGDIL8_D01 | Fructose-bisphosphate aldolase | 0 | 0 | 0 | 3 | 3 | 6 |
| 189 | LJGFL_P04_A09 | Sugar transporter | 2 | 3 | 0 | 2 | 1 | 8 |
| 200 | LJGD-L2G11 | Enolase | 0 | 1 | 0 | 3 | 2 | 6 |
| 292 | LJGFiM27_F04 | Cytochrome c oxidase IV | 1 | 2 | 1 | 0 | 1 | 5 |
| 97 | LJGDIM25_C06 | ADP/ATP translocase | 4 | 0 | 2 | 3 | 6 | 15 |
| 69 | LJGDIL9F01 | V-ATPase C-subunit | 2 | 6 | 4 | 7 | 7 | 26 |
| 67/192 | LJGFL_P01_G02 | Actin | 10 | 16 | 4 | 6 | 2 | 38 |
| 112 | LJGUM-P03G07 | Kazal-type serine protease inhibitor | 6 | 4 | 5 | 2 | 0 | 17 |
| 2287 | LJGFiM21_F02 | Serine protease inhibitor 4 | 0 | 0 | 1 | 0 | 0 | 1 |
| 358 | LJGFM5_B11 | Ras | 1 | 1 | 2 | 0 | 0 | 4 |
| 2556 | LJGDIM21_F11 | Aquaporin | 0 | 0 | 0 | 0 | 1 | 1 |
| 500 | LJGFiL8_B01 | Galectin | 0 | 0 | 1 | 2 | 0 | 3 |
| 235 | LJGFSP02C04 | Peptidoglycan recognition protein | 3 | 2 | 0 | 0 | 1 | 6 |
| 1960 | LJGDM27A10 | Defensin | 0 | 0 | 0 | 1 | 0 | 1 |
| 269 | LJGUM-P03F07 | 40S ribosomal protein S7 | 1 | 1 | 1 | 2 | 1 | 6 |
| 423 | LJGDM25A04 | 40S ribosomal protein S20 | 1 | 1 | 0 | 1 | 0 | 3 |
| 226 | LJGDiM26_A12 | 40S ribosomal protein S8 | 1 | 2 | 1 | 2 | 1 | 7 |
| 125 | LJGFiM25_D10 | 60S ribosomal protein L19 | 5 | 2 | 2 | 2 | 2 | 13 |
| 304 | LJGF-L-8E06 | 60S ribosomal protein L32 | 1 | 2 | 0 | 0 | 2 | 5 |
| 101 | LJGFiL3_D04 | 60S acidic ribosomal protein P0 | 3 | 0 | 4 | 5 | 5 | 17 |
| 108 | LJGU-m-5_A09 | 60S acidic ribosomal protein P1 | 6 | 0 | 2 | 6 | 4 | 18 |
| 119 | LJGUS_P03_A12 | 60S ribosomal protein L35Ae | 10 | 1 | 1 | 3 | 0 | 15 |
| 40 | LJGFIL7_D12 | Unknown | 6 | 4 | 13 | 25 | 22 | 70 |
| 54/55 | LJGF-I-10_A05 | Unknown | 7 | 13 | 13 | 11 | 4 | 48 |
| 88 | LJGU-I-10_D11 | Unknown | 5 | 2 | 2 | 5 | 2 | 16 |
| 151 | LJGFiL1_H05 | Unknown | 0 | 2 | 6 | 1 | 2 | 11 |
| 230 | LJGFiM22_H04 | Unknown | 0 | 1 | 3 | 1 | 2 | 7 |
| 90 | LJGU-I-7_C03 | Unknown | 5 | 5 | 5 | 4 | 5 | 24 |
| 276 | LJGFiL2_F05 | Unknown | 0 | 1 | 2 | 3 | 0 | 6 |
Table 22: Housekeeping and low-abundance transcripts from the midgut of Lutzomyia
longipalpis; localization, molecular weight and isoelectric point of putative midgut
proteins

| Cluster | Putative function | Gene name | Localization | Molecular weight (kDa) | Isoelectric point |
|---|---|---|---|---|---|
| 128 | Glyceraldehyde-3-phosphate dehydrogenase | - | Intracellular | 35.2 | 7.84 |
| 195 | Fructose-bisphosphate aldolase | - | Intracellular | 30.8 | 6.73 |
| 189 | Sugar transporter | - | Transmembrane | 53.7 | 7.03 |
| 200 | Enolase | - | Intracellular | 46.7 | 6.5 |
| 292 | Cytochrome c oxidase IV | - | Intracellular | 20.7 | 9.26 |
| 97 | ADP/ATP translocase | - | Transmembrane | 33.3 | 9.87 |
| 69 | V-ATPase C-subunit | - | Transmembrane | 16.0 | 8.41 |
| 67/192 | Actin | - | Intracellular | 41.8 | 5.29 |
| 112 | Kazal-type serine protease inhibitor | LuloKZL | Secreted | 6.3 | 4.83 |
| 2287 | Serine protease inhibitor 4 | LuloSRPN | Intracellular | 42.1 | 4.95 |
| 358 | Ras | - | Intracellular | 20.5 | 5.20 |
| 2556 | Aquaporin | - | Transmembrane | 27.7 | 8.71 |
| 500 | Galectin | LuloGalec | Intracellular | 17.2 | 7.33 |
| 235 | Peptidoglycan recognition protein | LuloPGRP | Intracellular | 21.9 | 6.75 |
| 1960 | Defensin | LuloDEF | Secreted | 7.2 | 6.89 |
| 269 | 40S ribosomal protein S7 | - | Intracellular | 21.9 | 9.82 |
| 423 | 40S ribosomal protein S20 | - | Intracellular | 13.4 | 10.44 |
| 226 | 40S ribosomal protein S8 | - | Intracellular | 23.6 | 10.72 |
| 125 | 60S ribosomal protein L19 | - | Intracellular | 24.0 | 11.13 |
| 304 | 60S ribosomal protein L32 | - | Intracellular | 16.0 | 11.77 |
| 101 | 60S acidic ribosomal protein P0 | - | Intracellular | 34.2 | 6.23 |
| 108 | 60S acidic ribosomal protein P1 | - | Intracellular | 11.5 | 4.08 |
| 119 | 60S ribosomal protein L35Ae | - | Intracellular | 16.8 | 11.24 |
| 40 | Unknown | - | Secreted | 29.2 | 9.59 |
| 54/55 | Unknown | - | Secreted | 14.3 | 9.13 |
| 88 | Unknown | - | Secreted | 14.5 | 7.78 |
| 151 | Unknown | - | Secreted | 11.9 | 4.72 |
| 230 | Unknown | - | Secreted | 11.6 | 9.95 |
| 90 | Unknown | - | Secreted | 19.0 | 3.8 |
| 276 | Unknown | - | Secreted | 16.6 | 3.41 |
Anti-bacterial molecules

Two molecules, originating from clusters 235 and 1960, encode a putative
peptidoglycan recognition protein (LuloPGRP) and a defensin (LuloDEF), respectively.
LuloPGRP is similar to other predicted peptidoglycan recognition proteins found in
Glossina morsitans morsitans and mosquitoes, while it is phylogenetically distinct from
Lepidopteran molecules (Figure 31). This is the first report of a putative PGRP identified
in sand flies, and a search of a midgut transcriptome database of P. papatasi identified
a molecule with 87% identity. LuloPGRP may serve as a pattern recognition protein,
specifically for the conserved structure of peptidoglycan, as indicated by the
conservation of the amino acid sequence among insects, and thus as a component of the
sand fly immune defense against bacterial pathogens (Figure 31). PGRP molecules
characterized in Bombyx mori and Trichoplusia ni have been shown to be expressed
primarily in the fat body and hemocytes, and it is conceivable that the identification of
LuloPGRP transcripts arose from contamination of the tissue sample [12, 13]. It is
nonetheless possible that the midgut tissue of sand flies expresses a PGRP for protection
against microorganisms ingested during sugar and blood-feeding, as a PGRP was identified
as preferentially expressed in the midgut of Samia cynthia ricini [14].

Defensins are another type of innate immune defense that insects possess to ward
off pathogenic bacteria. A single sequence, named LuloDEF, was identified in the
post-blood meal digestion midgut cDNA library with homology to a defensin molecule
characterized in Ae. aegypti. Like other insect defensin molecules, LuloDEF has a
predicted secretion signal peptide, and most of the sequence homology lies in the
carboxyl half of the sequence and in the conserved cysteine residues (Figure 32).
LuloDEF shares
Figure 31. Sequence analysis of peptidoglycan recognition proteins.

(A) Phylogenetic analysis of amino acid sequences of peptidoglycan recognition
proteins from Lutzomyia longipalpis (Lulo), Phlebotomus papatasi (Pp), Anopheles
gambiae (Ag), Aedes aegypti (Ae), Glossina morsitans morsitans (Glm), Drosophila
melanogaster (Dm), Tribolium castaneum (Tc), Apis mellifera (Am), Bombyx mori (Bm),
Galleria mellonella (Gam), Trichoplusia ni (Tn) and Samia cynthia ricini (Scr).
Accession numbers are in parentheses and bootstrap values indicate node support. (B)
Multiple sequence alignment of peptidoglycan recognition proteins. Identical amino acid
residues are highlighted black and similar residues are highlighted grey.

47% identity and 61% similarity with a defensin characterized in Phlebotomus duboscqi,
which is induced by the presence of wild-type Leishmania major [15]. Both
immunity-associated genes, LuloPGRP and LuloDEF, may have an impact on the progression
and outcome of a midgut infection by Leishmania parasites, either directly or indirectly
if co-colonization of the midgut with bacteria is an intermediary confounding factor.

Transcripts  differentially  expressed  by  blood-feeding  and  digestion 

A comparison between the sugar-fed and blood-fed libraries and between the blood-fed and
post-blood meal digestion libraries was conducted using Pearson’s chi-square test to
identify overrepresented transcripts within each cluster. As was previously seen in P.
papatasi, a number of digestion-associated transcripts were overabundant in the blood-fed
cDNA library [6]. We anticipated similar results in the analysis of the Lu. longipalpis
midgut cDNA libraries, with the added advantage of a cDNA library produced from
midguts that had fully processed and excreted the blood meal byproducts. It was our
hypothesis that the post-blood meal midgut transcript abundance would be most similar
to the sugar-fed midgut transcript abundance prior to a blood meal. Overall, the number
of sequences per cluster in the sugar-fed cDNA library was similar to that in the
post-blood meal digestion cDNA library, and more transcripts were overrepresented in the
blood-fed library (Table 23). Several exceptions to both overall observations do occur,
however. Most of the microvillar protein transcripts are abundant in the blood-fed cDNA
library except for LuloMVP3, which is highly represented in the sugar-fed and post-blood
meal digestion cDNA libraries. This reinforces the suggestion that the microvillar
Table 23: Sequence distribution altered during sugar-feeding and blood meal digestion;
clusters overrepresented in the sugar-fed, blood-fed and post-blood meal digestion
midgut cDNA libraries as determined by χ² statistical analysis

Putative function                           Cluster #  SF    BF    PBMD   P value   GenBank
Microvillar protein (LuloMVP1)              27         5     109   0      6.8E-03   EU124571
Microvillar protein (LuloMVP2)              29         3     87    0      3.2E-06   EU124572
Microvillar protein (LuloMVP3)              48         15    6     18     4.2E-02   EU124579
Microvillar protein (LuloMVP4)              66         1     24    0      3.5E-02   EU124584
Microvillar protein (LuloMVP5)              36         1     60    0      3.2E-04   EU124577
Trypsin (Lltryp1)                           35         3     55    0      4.9E-08   ABM26904
Trypsin (Lltryp2)                           18         135   6     109    4.2E-02   ABM26905
Chymotrypsin (LuloChym1A)                   33         3     51    1      1.1E-17   EU124576
Chymotrypsin (LuloChym2)                    64         0     17    0      2.4E-02   EU124583
Astacin-like metalloprotease (LuloAstacin)  58         23    5     0      3.6E-16   EU124581
Unknown                                     40         6     4     25     3.4E-02   EU124578

Table 24: Sequence distribution altered during sugar-feeding and blood meal digestion;
clusters that appear overabundant, but are not statistically significant by χ² analysis, in the
sugar-fed, blood-fed and post-blood meal digestion midgut cDNA libraries

Putative function                           Cluster #  SF    BF    PBMD   P value   GenBank
Peritrophin (LuloPer1)                      77         0     6     0      4.3E-02   EU124588
Peritrophin (LuloPer2)                      114        1     7     0      2.0E-05   EU124602
Chymotrypsin (LuloChym3)                    87         1     14    0      4.2E-02   EU124591
Chymotrypsin (LuloChym4)                    30         12    1     1      1.7E-03   EU124573
Carboxypeptidase (LuloCpepA1)               104        0     14    0      1.4E-02   EU124592
Carboxypeptidase (LuloCpepA2)               107        6     5     0      3.40E-02  EU124593
Carboxypeptidase (LuloCpepB)                91         1     8     1      2.4E-07   EU124594
60S acidic ribosomal protein P0             101        3     0     5      1.80E-02  EU124599
60S acidic ribosomal protein P1             108        6     0     6      9.2E-05   EU124600
60S ribosomal protein L35Ae                 119        10    1     3      4.2E-02   EU124603
proteins are likely functionally different molecules grouped solely on homology to
previously annotated sequences. In general, proteases appear to be induced by the act of
blood-feeding or the presence of a blood meal within the midgut, with the exception of
Lltryp2, which is significantly more abundant in the sugar-fed and post-blood meal
digestion cDNA libraries, and LuloAstacin, which is more abundant in the sugar-fed
cDNA library (Tables 23 and 24). These molecules may be produced and stored prior to
blood-feeding for immediate use in digestion, or may have a role other than digestion
altogether, such as immunity. Other proteases, such as LuloChym4 and LuloCpepA2, are
present in higher or near equal numbers in the sugar-fed library when compared with
the blood-fed library. The peritrophins LuloPer1 and LuloPer2 are
also more plentiful in the blood-fed cDNA library, suggesting that these molecules may
be transcribed only in response to blood-feeding. A transcript encoding a predicted
protein of unknown function derived from cluster 40 was identified as being most
abundant in the post-blood meal digestion cDNA library, suggesting that it may play a role
outside of blood meal digestion, such as in oogenesis.
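
The library comparisons above rest on Pearson’s chi-square test applied to the number of
sequences that a cluster draws from each of two libraries. A minimal Python sketch of that
calculation for a single cluster follows. It assumes a 2 x 2 contingency table (sequences
inside versus outside the cluster, for two libraries) with no continuity correction; the
exact table layout used for Tables 23 and 24 is not stated here, and the library totals in
the usage lines are hypothetical placeholders, since the totals are not given in this section.

```python
import math


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square (1 d.f., no continuity correction) for one cluster.

    count_a / count_b: sequences of the cluster found in library A / B.
    total_a / total_b: total sequences in library A / B (assumed inputs).
    Returns (chi2, p_value).
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, (total_a - count_a) + (total_b - count_b)]
    grand_total = total_a + total_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("expected count of zero; test is undefined")
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # LuloMVP1 (cluster 27): 5 sugar-fed versus 109 blood-fed sequences (Table 23).
    # The library totals of 2000 sequences each are hypothetical placeholders.
    chi2, p = chi_square_2x2(5, 2000, 109, 2000)
    print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

Because the p-value depends on the library totals, this sketch is not expected to reproduce
the tabulated P values unless the true totals are supplied.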

Transcripts  differentially  expressed  by  the  presence  of  Leishmania  infantum  chagasi 

To evaluate the effects of the presence of L. infantum chagasi parasites on
transcript abundance in the midgut tissue of the sand fly, we compared the number of
sequences in each cluster between the blood-fed and blood-fed Leishmania-infected
cDNA libraries, as well as between the post-blood meal digestion and post-blood meal
digestion Leishmania-infected cDNA libraries, using chi-square analysis (Tables 25-27).
Table 25: Sequence distribution altered by Leishmania infantum chagasi; clusters
overrepresented in the blood-fed and blood-fed L. infantum chagasi-infected midgut
cDNA libraries as determined by χ² statistical analysis

Putative function                           Cluster #  BF    BFi   P value   GenBank
Microvillar protein (LuloMVP1)              27         109   55    2.0E-04   EU124571
Microvillar protein (LuloMVP2)              29         87    40    2.0E-04   EU124572
Microvillar protein (LuloMVP4)              66         24    7     4.9E-03   EU124584
Microvillar protein (LuloMVP5)              36         60    27    1.6E-03   EU124577
Peritrophin (LuloPer1)                      77/78      6     22    1.0E-03   EU124588
Trypsin (Lltryp2)                           18         6     15    2.9E-02   ABM26905
Chymotrypsin (LuloChym1A)                   33         51    22    2.4E-03   EU124576
Carboxypeptidase (LuloCpepA1)               104        14    3     1.3E-02   EU124592
Actin                                       67/192     16    4     1.3E-02   EU124585
Unknown                                     40         4     13    1.7E-02   EU124578

Table 26: Sequence distribution altered by Leishmania infantum chagasi; clusters
overrepresented in the post-blood meal digestion and post-blood meal digestion L.
infantum chagasi-infected midgut cDNA libraries as determined by χ² statistical analysis

Putative function                           Cluster #  PBMD  PBMDi  P value   GenBank
Trypsin (Lltryp2)                           18         109   168    2.0E-04   ABM26905

Table 27: Sequence distribution altered by Leishmania infantum chagasi; LuloTryp3
appears underrepresented in the post-blood meal digestion L. infantum chagasi-infected
midgut cDNA library

Putative function                           Cluster #  PBMD  PBMDi  GenBank
Trypsin (LuloTryp3)                         83         7     1      EU124590
We hypothesized that the effects of the parasite’s presence in the blood-engorged sand fly
would likely mirror what we had observed in a similar comparison of P. papatasi
infected with L. major. Additionally, we hypothesized that the analysis of the post-blood
meal digestion midgut tissue would reveal a large number of differentially abundant
transcripts, as during this time period Leishmania parasites are interacting with the
midgut epithelium, replicating and differentiating to the metacyclic form. In accordance
with what we observed previously in blood-engorged P. papatasi infected with L. major,
there was an underrepresentation of the microvillar protein transcripts [6]. Similar trends
in abundance between infected P. papatasi and infected Lu. longipalpis also occur for
transcripts encoding the putative digestion enzymes trypsin (Lltryp2) and chymotrypsin
(LuloChym1A). Two other digestive proteases, LuloAstacin and LuloCpepA1, were
identified as differentially abundant in the presence of L. infantum chagasi, with a
reduction in the number of transcripts captured in the blood-fed Leishmania-infected
library; however, only the LuloCpepA1 difference was statistically significant. There is
a striking contrast in the modulated abundance of peritrophin transcripts. In the
midgut of infected P. papatasi, peritrophin transcripts decrease, whereas Lu. longipalpis
infected with L. infantum chagasi has a significant overrepresentation of peritrophin
(LuloPer1) and overrepresentation of the putative chitin-binding molecule (LuloChiBi).
There appears to be a downregulation of actin transcripts in the presence of L.
infantum chagasi parasites in the midgut. We speculate that this could be a tactic of the
parasite to decrease the cytoskeletal rearrangement that occurs after blood-feeding as a
means of decreasing peristalsis, which may aid in the retention of the parasite within the
gut of the sand fly.
In the context of abundant transcripts, the post-blood meal digestion midgut
infected with L. infantum chagasi is relatively quiescent. Only one transcript, encoding a
putative trypsin molecule, was identified as significantly different in abundance. Lltryp2
sequences were 1.54 times more abundant in the L. infantum chagasi-infected post-blood
meal digestion cDNA library (168 versus 109 sequences; Table 26), which corroborates the
observed overrepresentation of Lltryp2 sequences in the blood-fed infected cDNA library.
It is possible that the increase in sand fly Lltryp2 occurs due to the presence of a
perceived pathogen or as a consequence of a non-specific perception of contents within the
midgut. Conversely, LuloTryp3 transcripts were captured at a lower frequency in the L.
infantum chagasi-infected midgut after blood meal digestion (1 versus 7 sequences; Table 27).
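
The 1.54-fold figure quoted above is the ratio of raw sequence counts for the Lltryp2
cluster in the two post-blood meal digestion libraries. The short Python sketch below
reproduces that arithmetic; the function name is illustrative only, and the ratio is
deliberately not normalized for library size, which would require library totals that are
not given in this section.

```python
def count_ratio(count_infected: int, count_uninfected: int) -> float:
    """Ratio of raw cluster counts (infected library / uninfected library)."""
    if count_uninfected <= 0:
        raise ValueError("uninfected count must be positive")
    return count_infected / count_uninfected


# Lltryp2 (cluster 18): 109 sequences in the uninfected post-blood meal digestion
# library and 168 in the infected one.
lltryp2_ratio = count_ratio(168, 109)
print(round(lltryp2_ratio, 2))  # 1.54
```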

Conclusion 

Leishmania parasites develop to a transmissible and infective form entirely within
the confines of the alimentary tract of the sand fly, in contrast to numerous other
arthropod-borne pathogens. We wished to further investigate the responses of the sand fly
midgut tissues to blood meal ingestion and to interactions with
Leishmania parasites. The previously reported extensive sequencing of whole sand fly
Lutzomyia longipalpis ESTs provided a broad overview of the transcripts present in this
vector. However, it did not provide information regarding tissue-specific transcripts,
particularly from the sand fly midgut, or information regarding the midgut molecules that
may be transcribed in response to blood-feeding and digestion or interact with the
Leishmania parasite. In the present work, the production of five different cDNA libraries
generated a large number of redundant tissue-specific transcripts for analysis and
provided the capability of a comparative analysis between these cDNA libraries. Several
molecules were identified in this midgut-specific transcriptome that were not identified in
the EST database of whole sand fly sequences, including LuloKZL and LuloDEF.

The present analysis of midgut tissue from Lu. longipalpis further increases our
knowledge of the molecular events that occur throughout the adult life cycle of the sand
fly. In general, it appears that the midgut reverts, after complete digestion and excretion
of the blood meal, to a state closely resembling the midgut of a sand fly that has only taken
a sugar meal. Comparing data generated from the sugar-fed and blood-fed sand fly
midguts revealed global changes comparable to those found in the similar analysis of the
midgut of P. papatasi [6]. Microvillar proteins, digestive proteases and peritrophin
molecules are some of the transcripts identified as differentially represented between
cDNA libraries when comparing unfed and blood-fed sand flies. Interestingly, many
molecules, such as microvillar proteins and digestive proteases, were found to be over- or
underrepresented when comparing the blood-fed with the blood-fed Leishmania-infected
cDNA libraries. Similar results were observed in the midgut of P. papatasi when
infected with L. major. This demonstrates not only the reproducibility of this technique
of analyzing transcript abundance across cDNA libraries, but also the similarity in
the biology of blood-feeding and digestion in sand flies, as well as in the Leishmania-vector
interactions occurring in Old World and New World sand fly species. When
comparing the uninfected and L. infantum chagasi-infected post-blood meal digestion
libraries, we were surprised by the scarcity of differentially abundant transcripts,
considering the number and volume of Leishmania parasites present in the midgut at the
time points encompassed by the cDNA library. These data suggest that the Leishmania
parasite affects the midgut expression profile during the blood digestion process and not
afterwards. It is likely that Leishmania parasites modulate the expression of other
molecules, but our approach was not able to detect these molecules, probably as a result of
their low abundance. Further testing employing more direct techniques, such as real-time
PCR or other expression profiling approaches, is still required to test the hypothesis that L.
infantum chagasi alters the expression of specific gut transcripts in the sand fly
Lutzomyia longipalpis. However, the information presented in the current work and
previous work on P. papatasi and L. major strongly suggest that Leishmania parasites
can alter the expression of midgut transcripts that may be relevant for the survival and
establishment of the parasite in the gut of the fly, and that these changes may occur
during the digestion of the blood meal and not afterwards.

Methods 

Sand  flies 

Lutzomyia longipalpis sand flies (Jacobina strain) were maintained at the
Laboratory of Malaria and Vector Research at the National Institute of Allergy and
Infectious Diseases. Three- to four-day post-eclosion sand flies were given a 20%
sucrose solution (sugar-fed/unfed) or fed blood on anesthetized BALB/c mice
(blood-fed).

Leishmania  and  sand  fly  infection 

The infection of Lu. longipalpis using an artificial blood meal containing
Leishmania infantum chagasi-infected macrophages (blood-fed, infected) was based on
the work of Tesh and Modi [16]. Briefly, L. infantum chagasi
MHOM/BR00/MER/Strain 2 promastigote cultures were maintained in M199 (Sigma-Aldrich,
St. Louis, MO) containing 20% (v/v) fetal bovine serum (FBS) (Invitrogen,
Carlsbad, CA) and 100 units/ml penicillin, 100 µg/ml streptomycin and 0.292 mg/ml
glutamine (PSG) (Invitrogen, Carlsbad, CA) at 25°C. Macrophage cell line J774A.1
(American Type Culture Collection, Manassas, VA) was cultured in RPMI (Invitrogen,
Carlsbad, CA) containing 10% FBS and PSG at 37.0°C, 95% air, 5% CO2. At
confluency, the macrophages were scraped from the culture flask and washed twice by
centrifugation in phosphate buffered saline (PBS) at 380 x g for 10 minutes before
resuspension in culture media. The washed macrophages were then placed in 5 wells of a
24-well culture plate at a concentration of 2 x 10^6 cells/ml and allowed to adhere for 90
minutes at 37°C, 5% CO2. Stationary-phase L. infantum chagasi culture was washed by
centrifugation in PBS at 1200 x g for 15 minutes and resuspended in macrophage culture
media. Nonadherent macrophages were removed by replacing the culture media,
and Leishmania parasites were added at a 5:1 ratio of parasite to macrophage. The parasites
were co-cultured with the macrophages for 5 hours at 26°C. The culture was then washed
to remove extracellular parasites and the macrophages were scraped from the wells.
Macrophages were confirmed to contain intracellular amastigotes by staining with
QUICK III (Astral Diagnostics, Inc., West Deptford, NJ) according to the manufacturer’s
protocol and visualization by light microscopy. The infected macrophage culture was
centrifuged at 380 x g for 10 minutes and resuspended in 500 µl fresh whole mouse blood
collected in heparin. The blood containing amastigote-infected macrophages was used
for artificial blood-feeding of sand flies as described [17].
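
The 5:1 parasite-to-macrophage ratio translates into a parasite dose once the number of
macrophages per well is fixed. A rough Python sketch of that arithmetic is given below. The
1 ml volume per well is an assumption made only for illustration (the seeding volume is not
stated above), the function name is hypothetical, and the calculation ignores the
nonadherent macrophages that are removed before the parasites are added.

```python
def parasites_per_well(macrophages_per_ml, well_volume_ml, parasites_per_macrophage=5):
    """Parasites needed for one well at a given parasite:macrophage ratio."""
    if macrophages_per_ml < 0 or well_volume_ml < 0 or parasites_per_macrophage < 0:
        raise ValueError("inputs must be non-negative")
    return macrophages_per_ml * well_volume_ml * parasites_per_macrophage


# 2 x 10^6 macrophages/ml as in the protocol; 1.0 ml per well is hypothetical.
dose = parasites_per_well(2e6, 1.0)
print(f"{dose:.1e} parasites per well")
print(f"{5 * dose:.1e} parasites for the 5 wells")
```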
cDNA library construction

The conditions of the Lutzomyia longipalpis midguts harvested for the
construction of the five cDNA libraries were: unfed/sugar-fed four-day post-eclosion
female midguts; midguts containing blood 1, 2, and 3 days post-blood meal from female
flies allowed to feed on BALB/c mice; midguts containing blood 1, 2, and 3 days
post-blood meal from female flies given an artificial blood meal containing L. infantum
chagasi-infected macrophages; midguts devoid of blood 5, 6, and 7 days post-blood meal
from gravid flies allowed to feed on BALB/c mice; and midguts devoid of blood 5, 6, and 7
days post-blood meal from gravid flies given an artificial blood meal containing L.
infantum chagasi-infected macrophages. All midguts used for the construction of cDNA
libraries from infected sand flies were verified by microscopy as carrying
Leishmania infections comparable to mature infections seen in sand flies that are
used routinely in transmission experiments. Lu. longipalpis midguts were dissected in
phosphate buffered saline (PBS), placed in RNAlater (Sigma-Aldrich, St. Louis, MO)
and stored at 4°C prior to cDNA library construction. Libraries constructed using
midguts at different time points consisted of two midguts from each day the midguts were
dissected. Lu. longipalpis midgut mRNA was isolated from six midguts using the
MicroFastTrack mRNA isolation kit (Invitrogen, San Diego, CA). The cDNA libraries
were constructed using the SMART cDNA Library Construction Kit (Clontech,
Mountain View, CA) as described previously [18].
DNA Sequencing

Phage plaques lacking β-galactosidase activity were picked from the soft top agar
using a sterilized wooden stick and placed into 75 µl of ultrapure water in a 96-well
v-bottom plate. PCR was used to amplify the cDNA insert from 3 µl of the phage in water
using FastStart PCR Master premixed PCR reagent (Roche Applied Science,
Indianapolis, IN) and primers PT2F1 (AAGTACTCTAGCAATTGTGAGC) and PT2R1
(CTCTTCGCTATTACGCCAGCTG). Reaction conditions were 75°C, 3 min; 94°C, 4
min; 33 cycles of 94°C, 1 min; 49°C, 1 min; 72°C, 2 min; and a final extension at 72°C for 7
minutes. The PCR products were cleaned of buffering salts, dNTPs, and primers using
ExcelaPure 96-well UF PCR purification plates (Edge Biosystems, Gaithersburg, MD)
with three washes of 100 µl of ultrapure water and recovery in 30 µl of ultrapure water.
Cycle sequencing was accomplished using BigDye Terminator v3.1 (Applied
Biosystems, Foster City, CA), primer PT2F3 (TCTCGGGAAGCGCGCCATTGT), and
5 µl of the cleaned PCR product. The cycle sequencing products were prepared for
sequencing by centrifugation through hydrated Sephadex G-50 (Amersham, Piscataway,
NJ), desiccation, and rehydration with 10 µl sequencing buffer. Sequencing was
performed using a 3730xl DNA analyzer (Applied Biosystems, Foster City, CA).
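
The insert-amplification reaction conditions above can be written down as a small data
structure, which makes the programmed hold time easy to check. The Python sketch below
encodes the steps as listed in the text; the class and function names are illustrative, and
the total excludes ramping time between temperatures, which depends on the thermal cycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int      # hold temperature in degrees Celsius
    minutes: float   # hold time in minutes


# Steps transcribed from the reaction conditions in the text.
INITIAL_STEPS = [Step(75, 3), Step(94, 4)]
CYCLE_STEPS = [Step(94, 1), Step(49, 1), Step(72, 2)]
N_CYCLES = 33
FINAL_STEPS = [Step(72, 7)]


def total_hold_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, ignoring ramping between steps."""
    return (
        sum(s.minutes for s in initial)
        + n_cycles * sum(s.minutes for s in cycle)
        + sum(s.minutes for s in final)
    )


if __name__ == "__main__":
    total = total_hold_minutes(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES, FINAL_STEPS)
    print(f"Programmed hold time: {total} min")  # 3 + 4 + 33 * 4 + 7 = 146
```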

Bioinformatics 

Detailed descriptions of the bioinformatic analysis were reported previously
[19, 20]. Briefly, regions of high N (unidentified nucleotide) content were removed from
the 5’ and 3’ ends of each sequence, as were any primer or vector nucleotides.
Sequences from all five libraries were combined and contigs constructed by
clustering homologous sequences based on 100% identity over 64 nucleotides, while
sequences with greater than 5% N’s were discarded. Three-frame translated sequences
were supplied to the appropriate BLAST algorithm for comparison to the contents of the
NCBI non-redundant protein database, the Gene Ontology database [21] and the
conserved domain database [22], which contains the eukaryotic clusters of orthologous
groups (COG), Simple Modular Architecture Research Tool (SMART) and Protein Family
Database (Pfam) [23, 24]. Customized databases of mitochondrial and ribosomal RNA
nucleotide sequences also were used for the comparison of cDNA sequences. The
predicted presence of a signal secretion peptide or transmembrane helices was
determined using the SignalP [25] or TMHMM server [26], respectively. A custom
program, Count Libraries (JMC Ribeiro), was used to identify the number of transcripts
that each library contributed to the formation of a contig. The contigs, information for
each contig, and the BLAST and SignalP results were combined in a hyperlinked Excel
spreadsheet, and each contig was annotated by manually assigning the most likely predicted
function based on BLAST results. Sequences were aligned using Clustal X, version 1.83,
and converted to graphically aligned sequences using BioEdit, version 7.0.5.3 [27].
Phylogenetic analysis was conducted on amino acid alignments using TREE-PUZZLE,
version 5.2, generating trees by maximum likelihood using quartet puzzling with 10,000
puzzling steps to calculate node support [28]. The statistical significance of differences
in the number of transcripts per cluster between cDNA libraries was analyzed
using Pearson’s chi-square test.
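
Two of the steps above, handling of unidentified nucleotides and counting how many
sequences each library contributes to a contig, can be illustrated with a small Python
sketch. This is not the pipeline used in this work: trimming is simplified to stripping
runs of N from both ends, the 5% threshold is applied after trimming, and count_libraries
is a hypothetical stand-in for the custom Count Libraries program, whose implementation is
not described here.

```python
from collections import Counter


def trim_terminal_n(seq: str) -> str:
    """Strip runs of N from both ends of a read (simplified end trimming)."""
    return seq.upper().strip("N")


def passes_n_filter(seq: str, max_n_fraction: float = 0.05) -> bool:
    """Keep a read unless more than 5% of its bases are N."""
    if not seq:
        return False
    return seq.upper().count("N") / len(seq) <= max_n_fraction


def count_libraries(contig_reads):
    """Tally how many reads each library contributed to one contig.

    contig_reads: iterable of (read_id, library_label) pairs.
    """
    return Counter(library for _read_id, library in contig_reads)


if __name__ == "__main__":
    # Toy reads, invented for illustration only.
    reads = {"r1": "NNACGTACGTACGTACGTACGTACGTNN", "r2": "ACNNNNGT"}
    cleaned = {rid: trim_terminal_n(s) for rid, s in reads.items()}
    kept = {rid: s for rid, s in cleaned.items() if passes_n_filter(s)}
    print(sorted(kept))
    print(count_libraries([("r1", "SF"), ("r3", "BF"), ("r4", "BF")]))
```

The per-library tallies produced this way are the kind of counts listed in the SF, BF and
PBMD columns of the tables in this chapter.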
Chapter 5

Temporal profiling of transcripts in the midgut tissue of the sand fly Lutzomyia
longipalpis: effects of blood-feeding and Leishmania infantum chagasi infection




Abstract 

The midgut of a sand fly is a key organ for the successful proliferation of Leishmania
parasites and their differentiation into an infective and transmissible form, but little is
known about the molecular interactions that occur between Leishmania infantum chagasi
and the principal vector, Lutzomyia longipalpis. A comparative analysis of five cDNA
libraries identified transcripts that may be differentially modulated at certain time points
after blood meal ingestion in the presence or absence of L. infantum chagasi. Several
transcripts, which were found in higher numbers in midgut cDNA libraries from
sugar-fed, blood-fed, or L. infantum chagasi-infected sand flies, were chosen for multiplexed
quantitative RT-PCR analysis. Temporal expression profiles of transcripts, such as those
encoding putative chymotrypsin, carboxypeptidase, astacin-like metalloprotease,
peritrophin-like and microvillar proteins, showed strong modulation by the ingestion of a
blood meal. Actin and a trypsin molecule showed very subtle changes in transcript
abundance in the presence of L. infantum chagasi. Colonization of Lu. longipalpis by L.
infantum chagasi induced changes both during blood digestion and at later time points
associated with metacyclogenesis and transmission. Furthermore, transcript profile
analysis allowed a comparison between blood-fed and L. infantum chagasi-infected
sand flies as the blood meal is digested and the parasite proliferates within the midgut.
Temporal profiling of specific transcripts provides insights into the processes of blood
meal digestion and demonstrates the effect L. infantum chagasi has on the
transcripts of Lu. longipalpis midgut tissue.
Background

The leishmaniases are worldwide tropical diseases infecting an estimated 12
million people, with over 350 million people at risk. Visceral leishmaniasis is one of
several manifestations of disease caused by the protozoan parasite Leishmania. Visceral
leishmaniasis is fatal if untreated, and the traditional treatment, administration
of the toxic pentavalent antimonial drugs, commonly causes adverse effects and
sometimes life-threatening conditions such as cardiac arrhythmia and acute pancreatitis.
There is one licensed vaccine for canine leishmaniasis, which may function as an
immunotherapeutic and transmission-blocking vaccine as well; however, no vaccine is
available for use in humans [1, 2]. Current pursuits of a vaccine against visceral
leishmaniasis are focusing on Leishmania-host interactions, sand fly-host interactions and
sand fly-Leishmania interactions. Vaccines based on Leishmania-host interactions, that is,
vaccination using parasite proteins, are the largest of the three aforementioned efforts and
have produced a vast amount of information. Exploration of the sand fly-host interaction
as a means of vaccination, namely, the use of sand fly saliva to elicit a protective immune
response, has been very productive. By comparison, the workforce researching sand
fly-Leishmania interactions as potential targets for transmission-blocking vaccines is
smaller and has focused on Leishmania-derived molecules. There is a fundamental
need to fill this knowledge gap, for both scientific and medical reasons.

The sand fly Lutzomyia longipalpis is a vector of the protozoan agent known to
cause visceral leishmaniasis, Leishmania infantum chagasi. Infection of the sand fly
occurs in females following the ingestion of an infected blood meal. The Leishmania
parasite then develops solely within the midgut through several stages and defeats or
avoids life cycle barriers to ultimately propagate and mature into the infective metacyclic
promastigote. A transmission-blocking vaccine exploits the necessity for specific
molecular interactions that occur during the development of a transmissible infection.
Unlike arboviruses or Plasmodium, which migrate to and infect the salivary glands of their
respective vectors, the Leishmania parasite develops within the confines of the alimentary
canal of the sand fly. Through the sequencing of midgut tissue-specific cDNA libraries
of the sand fly Phlebotomus papatasi, the receptor necessary for Leishmania major
parasite binding, PpGalec, was identified and shown to be required for complete
development of a transmissible infection [3, 4].

Sequencing cDNA libraries of the midgut of Lutzomyia longipalpis identified
numerous molecules that likely interact with and impact the life cycle of the pathogen
Leishmania infantum chagasi [5]. Differential comparison of the transcript abundance in
cDNA libraries constructed from midgut tissues under conditions of sugar-fed, blood-fed,
post-blood meal digestion and in the presence or absence of L. infantum chagasi
identified molecules potentially altered by each of these conditions. Elucidation of the
molecular events occurring during blood-feeding and digestion, as well as the interactions
between the sand fly midgut tissue and Leishmania parasites, warrants further research.
In this chapter, we provide an in-depth temporal quantitation of ten transcripts identified
in the midgut transcriptome and reveal the impact on transcription of colonization of the
midgut by L. infantum chagasi.
Results and discussion

Previously, sequencing and differential comparison of midgut cDNA libraries of
the sand fly Lutzomyia longipalpis found that a number of transcripts were over- or
under-expressed when Leishmania infantum chagasi was present within the midgut [5]. Ten
transcripts were chosen, based on their potential importance in L. infantum chagasi
infection of the sand fly and sequence abundance in the cDNA libraries, for further
investigation by quantitative multiplex reverse transcriptase-PCR (qm RT-PCR) (Table
28). The transcripts analyzed encode proteins categorized as proteases, peritrophins,
and microvillar proteins, as well as actin. The proteases are likely integral to blood meal
digestion and have potential importance in the establishment of successful Leishmania
colonization. Peritrophic matrix formation begins immediately following blood ingestion
so as to envelop the blood bolus. While the peritrophic matrix is important in blood
meal digestion, it serves as a barrier to Leishmania parasites, which must cross the
peritrophic matrix and bind the midgut epithelium to complete the life cycle within the
sand fly. Microvillar proteins, identified as some of the most abundant midgut
transcripts, are functionally uncharacterized proteins that are over- or under-expressed in
the different conditions evaluated by cDNA library sequencing. Table 29 provides the
gene name and putative translational product, NCBI accession number, PCR product size
without the additional nucleotides provided by the multiplex universal primers, primer
sequences and the concentration of primer used in the multiplex reaction. The cDNA
libraries generating the previously reported transcript abundances are derived from the
pooling of midgut tissue from the general conditions of blood-fed (visible blood within
the midgut) and post-blood meal digestion (no visible traces of blood in gravid sand fly
midgut). The following is a comparative temporal profile of transcripts in the
midgut tissue of sugar-fed female Lu. longipalpis sand flies and of artificially blood-fed
and L. infantum chagasi-infected blood-fed females at 6 hours, daily from 1 to 7 days, and
at 13 days post-blood meal ingestion.

Table 28: Transcript abundance in the cDNA libraries of Lutzomyia longipalpis
sugar-fed (SF), blood-fed (BF), blood-fed L. infantum chagasi-infected (BFi), post-blood
meal digestion (PBMD), post-blood meal digestion L. infantum chagasi-infected

L.  infantum  chagasi  is  most  abundant  prior  to  the  defecation  of  digested  blood. 

Leishmania infection was verified and quantified using quantitative multiplex RT-PCR
(qm RT-PCR) to correlate the presence of parasite with alterations in midgut transcript
abundance. L. infantum chagasi reaches its highest number two days after the infectious
blood meal is taken, as measured by the presence of Leishmania alpha-tubulin (TUA)
transcripts (Figure 33). TUA transcripts, used here as a proxy for parasite numbers, were
below the detectable limit six hours post-blood meal (PBM) and were first detected ~24
hours PBM. TUA abundance peaked at day 2 PBM, corresponding with rapid parasite
replication within the blood meal. The TUA transcripts then decreased at day 3, likely
due to parasite death and a decrease in TUA expression. Excretion of the parasites in the
feces of the sand fly likely contributes to the considerable decrease in TUA transcripts;
however, most of the blood meal byproducts were excreted between days 3 and 4, so
excretion alone may not account for the decrease already seen at day 3. There was then a
slight and steady increase in TUA expression as the parasites continued to replicate in
the midgut.

Figure 33. Temporal profile of L. infantum chagasi infection in the midgut of Lu.
longipalpis.

L. infantum chagasi alpha-tubulin relative transcript abundance, normalized to Lu.
longipalpis S7 rRNA, after artificial blood-feeding of L. infantum chagasi-infected
macrophages. Error bars represent standard deviation.

[Plot title: Leishmania infantum chagasi abundance (qm RT-PCR); x-axis: time post blood meal]

Microvillar protein transcripts are affected by the presence of L. infantum chagasi.

The two microvillar proteins selected for analysis demonstrate contrasting
expression patterns in the cDNA libraries, with MVP3 most abundant in the absence
of a blood meal and MVP4 found almost solely in the presence of blood within the
midgut (Table 28). MVP3 is more abundant in sugar-fed sand flies than during the first
three days following blood meal ingestion; it then increases, reaching peak abundance
five days after blood-feeding, and returns to a level nearly equivalent to that found in
the sugar-fed fly by 13 days PBM (Figure 34A). The impact of Leishmania on MVP3
transcript abundance could not be measured in the cDNA libraries, but in this temporal
analysis there appears to be a significant increase during blood digestion, followed by a
nearly two-fold decrease five days PBM. It appears that the presence of L. infantum
chagasi parasites within the midgut advances the expression of MVP3. We also
investigated MVP4 abundance. As early as six hours after blood-feeding, MVP4 is
dramatically induced; it reaches peak levels two days PBM and is then maintained at very
low amounts 13 days after feeding (Figure 35A). There was a slight impact on MVP4
expression in the infected sand fly, with a detectable increase at one and five days PBM
and a decrease at three days. To reaffirm the impact of Leishmania on transcript
abundance as measured by quantitative multiplex RT-PCR, the blood-fed and L. infantum
chagasi-infected samples were analyzed by real-time RT-PCR (rt RT-PCR). Overall, the
results of the rt RT-PCR show the same relative abundance of transcripts at all time
points assayed by qm RT-PCR (Figures 34B and 35B). The inherent sensitivity of rt RT-PCR
revealed stronger impacts on MVP3 and MVP4 expression. Expanding on the dichotomy of
microvillar protein expression, it is more evident that MVP3 expression is induced by L.
infantum chagasi during blood meal digestion, while MVP4 is induced after the digestion
and excretion of blood meal products. Annotated as microvillar proteins due to their
homology with Blattella germanica and Periplaneta americana microvilli proteins, sand
fly microvillar proteins contain a number of conserved insect allergen repeats [6, 7]. The
function of microvillar proteins is currently unknown; however, the impact of
blood-feeding and of the presence of L. infantum chagasi within the midgut on MVP
expression levels warrants further research on the importance of these molecules.

Figure 34. Comparative abundance of microvillar protein MVP3 in uninfected and L.
infantum chagasi-infected Lu. longipalpis midguts.

Transcript abundance measured by (A) quantitative multiplex RT-PCR and (B)
real-time RT-PCR. Bars indicate MVP3 relative transcript abundance, normalized to Lu.
longipalpis S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped
bars) sand flies. Diamonds indicate the MVP3 expression ratio of infected to uninfected
sand flies. Error bars represent 95% confidence intervals.

[Panel titles: MVP3 expression (qm RT-PCR); MVP3 expression (rt RT-PCR). Axes: relative
abundance; infected:uninfected expression ratio (♦)]

Figure 35. Comparative abundance of microvillar protein MVP4 in uninfected and L.
infantum chagasi-infected Lu. longipalpis midguts.

Transcript abundance measured by (A) quantitative multiplex RT-PCR and (B)
real-time RT-PCR. Bars indicate MVP4 relative transcript abundance, normalized to Lu.
longipalpis S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped
bars) sand flies. Diamonds indicate the MVP4 expression ratio of infected to uninfected
sand flies. Error bars represent 95% confidence intervals.
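
The figure legends in this section report each transcript as a relative abundance
normalized to Lu. longipalpis S7 and, as diamonds, the ratio of infected to uninfected
abundance. As a rough illustration of that arithmetic only, the Python sketch below
divides a target signal by the S7 signal from the same sample and then takes the
infected:uninfected ratio. The function names and all numeric values are hypothetical
placeholders, not data from this study, and the instrument software may apply
additional steps that are not shown here.

```python
def normalized_abundance(target_signal, reference_signal):
    """Relative abundance of a target transcript, normalized to a reference gene."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return target_signal / reference_signal


def infection_ratio(infected, uninfected):
    """Infected:uninfected expression ratio; None when the uninfected value is zero."""
    if uninfected == 0:
        return None
    return infected / uninfected


# Hypothetical signals (arbitrary units) for one transcript at one time point.
uninfected = normalized_abundance(target_signal=1200.0, reference_signal=4000.0)
infected = normalized_abundance(target_signal=2400.0, reference_signal=4000.0)
ratio = infection_ratio(infected, uninfected)
print(f"uninfected={uninfected:.2f} infected={infected:.2f} ratio={ratio:.1f}")
```

Returning None when the uninfected value is zero mirrors the time points at which a
transcript was below the detectable limit, where a ratio is undefined.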

L.  infantum  chagasi  accelerates  and  upregulates  peritrophin  transcription. 

Two putative peritrophin molecules were assayed for temporal abundance, Per1
and Per2. One and two days after blood ingestion Per1 was abundant, and at all other
time points analyzed, the transcript was below the detectable limit of the quantitative
multiplex RT-PCR (Figure 36A). In response to the presence of L. infantum chagasi,
Per1 transcript increases 1.7- or 3.6-fold one day PBM as measured by multiplex or rt
RT-PCR, respectively, and is below levels normally found in the uninfected Lu.
longipalpis two days after blood-feeding. Due to the sensitivity of rt RT-PCR, the minute
amounts of Per1 detected from day 4 onward after blood-feeding showed an average
1.8-fold increase in transcription when infected with L. infantum chagasi, whereas the
multiplex was unable to detect the transcript (Figure 36). An additional molecule
identified as a putative peritrophin, Per2, is also induced by blood ingestion and
decreases after blood digestion (Figure 37). Per2 is found earlier in the midgut tissue
than Per1, reaching peak levels at both six hours and two days after feeding, with low
basal levels maintained at times when the midgut contains no blood. There is a subtle
decrease in Per2 transcript abundance in blood-fed infected sand flies at both two and
three days PBM, and then a significant increase is notable after blood digestion,
particularly at Days 5 and 7 (Figure 37). Per1 contains four possible chitin-binding
domains, presumably integral to the cross-linking of the chitin fibrils of the peritrophic
matrix. It is clear that Per1 is highly regulated, being both induced and repressed in
response to blood-feeding and digestion, which is to be expected of a molecule involved
in the temporary existence of a specialized structure, the peritrophic matrix. Per2 is
unique in that it contains only one potential chitin-binding domain and thus cannot act
in the cross-linking of chitin fibrils. Per2 and other single-domain chitin-binding
molecules may act to cap the end of chitin fibrils, sequester free chitinous molecules,
or regulate fungal pathogens as innate immune molecules within the midgut. It appears
that the presence of L. infantum chagasi promotes the earlier formation of the
peritrophic matrix in an effort to restrict the locality of the parasite within the blood
bolus. Prior research indicates that the peritrophic matrix plays a vital role in
protecting Leishmania parasites from midgut proteases as they differentiate from
amastigotes to promastigotes [8]. At the same time, the parasites must also escape the
peritrophic membrane prior to the passing of the blood meal for the continuation of
infection [8]. A hypothesis explaining the early formation and subsequent early
dissociation of the peritrophic matrix is that Leishmania requires quick shelter of
transitional-stage parasites from proteases and then escape from the peritrophic matrix
to facilitate attachment to the midgut epithelium.

Figure 36. Comparative abundance of peritrophin Per1 in uninfected and L. infantum
chagasi-infected Lu. longipalpis midguts.

Transcript abundance measured by (A) quantitative multiplex RT-PCR and (B)
real-time RT-PCR. Bars indicate Per1 relative transcript abundance, normalized to Lu.
longipalpis S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped
bars) sand flies. Diamonds indicate the Per1 expression ratio of infected to uninfected
sand flies. Error bars represent 95% confidence intervals.

Figure 37. Comparative abundance of peritrophin Per2 in uninfected and L. infantum
chagasi-infected Lu. longipalpis midguts.

Transcript abundance measured by (A) quantitative multiplex RT-PCR and (B)
real-time RT-PCR. Bars indicate Per2 relative transcript abundance, normalized to Lu.
longipalpis S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped
bars) sand flies. Diamonds indicate the Per2 expression ratio of infected to uninfected
sand flies. Error bars represent 95% confidence intervals.
Carboxypeptidases are increased in response to L. infantum chagasi colonization.

We recently described several carboxypeptidases in the transcriptome of the Lu.
longipalpis midgut in blood-fed sand flies [5]. This temporal analysis of CpepA1 and
CpepB carboxypeptidase transcripts confirmed the correlation of Cpep expression with
blood-feeding as observed in the cDNA library comparative analysis. Both
carboxypeptidases are scarce in sugar-fed sand flies but are induced as early as six hours
PBM and peak in abundance two days after a blood meal is taken (Figures 38 and 39).
There is a striking reduction in Cpep transcripts three days PBM and expression remains
at a low basal level, likely until a subsequent blood meal is taken. The similarities in the
expression profiles of these carboxypeptidases are observed in the L. infantum
chagasi-infected sand flies as well. There is an increase in transcript abundance between
six hours and one day PBM with expression peaking at day two. The presence of
Leishmania parasites within the midgut appears to initiate an increased amount of these
digestive proteases early in blood digestion. The decrease in transcripts two days PBM
could be an effort of the parasite to regulate the amount of proteases present within the
midgut lumen as a protective measure. However, it is also possible that the parasite
induces increased transcript production early in blood digestion and the increased
transcript or protein product acts as a negative feedback on translation of these
carboxypeptidases. Strikingly, late in L. infantum chagasi colonization of the sand fly,
there is a significant increase in CpepA1 transcripts, seen at seven and 13 days post
infection. While increases in digestive proteases are likely a result of the presence of
parasite protein in the lumen of the midgut, it is possible that protease induction is a
host defense mechanism of the sand fly. Subtle increases in CpepB expression are also
seen late in L. infantum chagasi colonization.

Figure 38. Comparative abundance of carboxypeptidase CpepA1 in uninfected and L.
infantum chagasi-infected Lu. longipalpis midguts.

Bars indicate CpepA1 relative transcript abundance, normalized to Lu. longipalpis
S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped bars) sand
flies. Diamonds indicate the CpepA1 expression ratio of infected to uninfected sand flies.
Error bars represent 95% confidence intervals.

Figure 39. Comparative abundance of carboxypeptidase CpepB in uninfected and L.
infantum chagasi-infected Lu. longipalpis midguts.

Bars indicate CpepB relative transcript abundance, normalized to Lu. longipalpis
S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped bars) sand
flies. Diamonds indicate the CpepB expression ratio of infected to uninfected sand flies.
Error bars represent 95% confidence intervals.

Figure 40. Comparative abundance of chymotrypsin, Chym3, in uninfected and L.
infantum chagasi-infected Lu. longipalpis midguts.

Bars indicate Chym3 relative transcript abundance, normalized to Lu. longipalpis
S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped bars) sand
flies. Diamonds indicate the Chym3 expression ratio of infected to uninfected sand flies.
Error bars represent 95% confidence intervals.

Late-stage  L.  infantum  chagasi  infection  increases  chymotrypsin  abundance. 

The chymotrypsin analyzed, Chym3, represents another blood-feeding-induced
digestive protease. There is little impact by L. infantum chagasi on Chym3 expression
immediately following feeding, but a two-fold decrease is observed two days PBM, with a
subsequent increase in Chym3 expression reaching 8- and 17-fold increases in transcript
abundance five and seven days after infection, respectively (Figure 40). Thirteen days
after ingesting L. infantum chagasi, the sand fly is producing 3.4 times the normal amount
of Chym3 transcript. While Chym3 is most abundant in the same time frame as the
carboxypeptidases, there is a variation in the profile of expression: Chym3 is most
abundant six hours after feeding and decreases until it drops to a low basal level three
days PBM. The variation in expression profiles may be due to separate transcriptional
regulation such that carboxypeptidases are induced by the presence of blood within the
midgut while Chym3 is induced by the act of feeding or probing. If the increase in
Chym3 transcripts correlates with a large increase in chymotrypsin activity in the midgut
of sand flies with mature transmissible L. infantum chagasi, then there is the possibility
of midgut enzymes exerting some effect on disease and transmission once the parasite is
regurgitated into the host tissue.

Figure 41. Comparative abundance of astacin in uninfected and L. infantum chagasi-
infected Lu. longipalpis midguts.

Bars indicate astacin relative transcript abundance, normalized to Lu. longipalpis
S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped bars) sand
flies. Diamonds indicate the astacin expression ratio of infected to uninfected sand flies.
Error bars represent 95% confidence intervals.

Figure 42. Comparative abundance of trypsin Tryp3 in uninfected and L. infantum
chagasi-infected Lu. longipalpis midguts.

Bars indicate Tryp3 relative transcript abundance, normalized to Lu. longipalpis
S7 rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped bars) sand
flies. Diamonds indicate the Tryp3 expression ratio of infected to uninfected sand flies.
Error bars represent 95% confidence intervals.
Astacin transcripts increase but trypsin is relatively unaffected by L. infantum
chagasi.

The expression profiles of two other digestive proteases, Astacin and Tryp3, were
also investigated and differed in several ways from the previously described proteases.
Astacin is abundant in sugar-fed sand flies, correlating with sequence abundance in the
transcriptome, and shows a decrease as the blood meal is digested, reaching the lowest
levels detected at day three (Table 28 and Figure 41). After the blood meal is passed,
Astacin begins increasing, and by 13 days after the last blood meal the transcripts are
almost as abundant as found in sugar-fed sand flies. Leishmania infantum chagasi
appears to stimulate an overall induction of Astacin, with two-fold increases seen for
several days in a row. Dipteran astacin molecules are most likely digestive enzymes, and
there have been no previous reports of astacin proteases acting as antimicrobials;
however, this expression profiling convincingly demonstrates an effect of L. infantum
chagasi on Astacin abundance. Tryp3 shares homology with molecules in a class of
insect trypsins that has been referred to previously as constitutive or early trypsins,
which have high transcript abundance prior to blood-feeding that is then reduced as the
blood meal is digested [9]. Blood-feeding causes a reduction in transcript abundance
during the early stages of digestion, 6-24 hours after feeding (Figure 42). Surprisingly,
in uninfected flies, transcript abundance increases at two days and has two peaks of high
abundance at Days 3 and 7 PBM, with the highest Tryp3 abundance 13 days after feeding.
There is very little effect on Tryp3 transcripts in L. infantum chagasi-infected sand
flies (Figure 42). While statistically significant differences occur at Days 3, 6 and 13,
the fold change in expression may result in a negligible change in protein expression.
The induction of Tryp3 as early as two days after blood-feeding is not seen in the
transcriptional analysis of trypsin in the midgut of Aedes aegypti or P. papatasi, and
additionally, this early induction is not observed in the two characterized Lu.
longipalpis trypsins, Lltryp1 and Lltryp2 [9-11]. Previous studies investigating the
temporal expression levels of Dipteran trypsins utilized semi-quantitative PCR, whereas
the quantitative multiplex used here detected oscillations in the expression of what
might be considered a “constitutive” trypsin [9-11].

Figure 43. Comparative abundance of actin in uninfected and L. infantum chagasi-
infected Lu. longipalpis midguts.

Bars indicate actin relative transcript abundance, normalized to Lu. longipalpis S7
rRNA, in sugar-fed, blood-fed (white bars) or blood-fed infected (striped bars) sand flies.
Diamonds indicate the actin expression ratio of infected to uninfected sand flies. Error
bars represent 95% confidence intervals.

Blood  meal  digestion  induces  actin  transcription. 

Actin transcript abundance was assessed as a result of the significant reduction in
sequences captured in the L. infantum chagasi-infected midgut cDNA libraries (Table
28). Surprisingly, qm RT-PCR found a statistically significant increase in transcript
abundance in L. infantum chagasi-infected compared with uninfected sand fly midguts
one, five, and seven days after blood-feeding (Figure 43). However, the difference in
actin transcript abundance between infected and uninfected midguts is minute (Figure
43). The discrepancy between cDNA library sequence abundance and transcript quantity
measured by RT-PCR demonstrates the importance of evaluating the expression profile of
each transcript identified as differentially expressed by the sequencing of individual
cDNA libraries. Actin transcripts increase after blood-feeding, peaking at two days after
ingestion, then drop to the lowest measured level three days post-blood meal. Once the
blood meal is excreted, actin transcription appears to be maintained at a steady basal
level. Blood-induced transcription of actin has been noted in the mosquito Anopheles
gambiae, and this induction correlates with cytoskeletal and cell morphological changes
in the midgut epithelium [12].

Conclusion 

The  establishment  and  propagation  of  a  transmissible  Leishmania  infection  within 
the  sand  fly  vector  occurs  within  the  lumen  of  the  midgut.  In  comparison,  Plasmodium 
ookinetes transit through the midgut epithelium of mosquitoes one day after
blood-feeding. Previous work identified gene induction during Plasmodium invasion of
epithelial  cells  of  mosquito  midgut  tissue  [13].  Leishmania  parasites  can  persist  in  the 
midgut  of  a  competent  sand  fly  host  for  the  remaining  life  span  of  the  vector.  Although 
Leishmania  do  not  actively  invade  midgut  epithelial  cells,  they  must  adhere  to  the  cell 
surface  to  prevent  being  excreted  with  the  digested  blood  meal.  During  Leishmania 
adhesion to the midgut epithelium, responsive cellular signaling likely occurs.
The abundance of parasites and extracellular Leishmania molecules within the lumen
also  impacts  the  type  and  quantity  of  molecules  transcribed  by  the  midgut  tissue.  We 
wished  to  further  understand  the  impact  of  L.  infantum  chagasi  colonization  on  Lu. 
longipalpis  sand  flies  by  evaluating  a  number  of  transcripts  important  in  peritrophic 
matrix  formation  and  digestion. 

Artificial infection of the sand flies was accomplished using a mouse macrophage
cell line infected with L. infantum chagasi amastigotes in an attempt to replicate the
natural presentation of the parasite to the midgut environment. This method resulted in
all of the dissected flies being colonized with an abundance of metacyclic promastigotes
13 days after blood-feeding, equating to a transmissible infection as observed in other
studies. This work confirms the modulation of Lu. longipalpis midgut transcript
abundance of specific molecules by L. infantum chagasi, as was first reported in the
differential comparison of cDNA library sequence abundance [5]. More specifically, the
presence of L. infantum chagasi within the midgut of Lu. longipalpis caused notable
induction or repression of transcripts necessary for peritrophic matrix formation and
blood meal digestion. The manipulation of the sand fly midgut transcriptome by
Leishmania parasites may serve to benefit the parasite’s survival and necessary
differentiation to the final, infective metacyclic form. Additionally, the presence of the
parasite within the midgut of the sand fly may trigger protective responses, such as
increased protease expression, suggesting that the Leishmania colonization of the sand
fly is neither mutualistic nor commensal. There are significant increases in protease
transcription at time points associated with transmission. We hypothesize that
upregulated proteases could be present in the regurgitated parasite milieu and contribute
to enhanced transmission. Included in this work is the first reported molecular-based
quantitation of Leishmania parasites as they develop within the midgut of the sand fly.

Methods 

Sand  flies 

Lutzomyia  longipalpis  sand  flies  (Jacobina  strain)  were  maintained  at  the 
Laboratory  of  Malaria  and  Vector  Research  at  the  National  Institute  of  Allergy  and 
Infectious Diseases. Three- to four-day post-eclosion sand flies were given a 20%
sucrose solution (sugar-fed/unfed) or fed on the blood of anesthetized BALB/c mice
(blood-fed).
Leishmania and sand fly infection

Lu. longipalpis sand flies were infected by an artificial blood meal containing
Leishmania infantum chagasi-infected macrophages (blood-fed, infected) as reported
previously with minor modifications [5]. In brief, L. infantum chagasi
MHOM/BROO/MER/Strain 2 promastigote cultures were maintained in M199
(Sigma-Aldrich, St. Louis, MO) containing 20% (v/v) fetal bovine serum (FBS) (Invitrogen,
Carlsbad, CA) and 100 units/ml penicillin, 100 µg/ml streptomycin, and 0.292 mg/ml
glutamine (PSG) (Invitrogen, Carlsbad, CA) at 25°C. Macrophage cell line J774A.1
(American Type Culture Collection, Manassas, VA) was cultured in RPMI (Invitrogen,
Carlsbad, CA) containing 10% FBS and PSG at 37.0°C, 95% air, 5% CO2. At
confluency, the macrophages were scraped from the culture flask and washed twice by
centrifugation in phosphate buffered saline (PBS) at 380 x g for 10 minutes before
resuspension in culture media. The washed macrophages were then placed in 2 wells of a
24-well culture plate, 2 x 10^6 cells per well, and allowed to adhere for 1 hour at 37°C,
5% CO2. Stationary-phase L. infantum chagasi culture was washed by centrifugation in PBS
at 1200 x g for 15 minutes and resuspended in macrophage culture media. Nonadherent
macrophages were removed by replacing the culture media, and Leishmania parasites in
macrophage culture media were added at a 5:1 ratio of parasite to macrophage. The
parasites were co-cultured with the macrophages for 5 hours at 26°C. The culture was then
washed to remove extracellular parasites and the macrophages were scraped from the wells.
Macrophages were confirmed to contain intracellular amastigotes by staining with
QUICK III (Astral Diagnostics, Inc., West Deptford, NJ) according to the manufacturer’s
protocol and visualization by light microscopy. The infected macrophage culture was
centrifuged at 380 x g for 10 minutes and resuspended in 500 µl fresh whole mouse blood
collected in heparin. The blood containing amastigote-infected macrophages was used
for artificial blood-feeding of sand flies as described [14].
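
The parasite inoculum implied by the seeding density and the 5:1 ratio above can be
worked out directly. The short Python sketch below does only that arithmetic; it
assumes every seeded macrophage adheres, which overstates the true cell number because
nonadherent cells are removed before infection, and the constant and function names are
illustrative only.

```python
CELLS_PER_WELL = 2 * 10**6      # macrophages seeded per well (from the text)
WELLS = 2                       # wells of the 24-well plate (from the text)
PARASITES_PER_MACROPHAGE = 5    # 5:1 parasite-to-macrophage ratio (from the text)


def parasites_needed(cells_per_well, wells, ratio):
    """Total parasites required to reach the stated ratio across all wells."""
    return cells_per_well * wells * ratio


total = parasites_needed(CELLS_PER_WELL, WELLS, PARASITES_PER_MACROPHAGE)
per_well = total // WELLS
print(f"{total:.1e} parasites in total, {per_well:.1e} per well")
```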

RNA  extraction 

Midgut tissue from female Lu. longipalpis was dissected at 6 hours and 1, 2, 3, 4,
5, 6, 7, and 13 days after artificial blood-feeding with or without the presence of L.
infantum chagasi-infected macrophages. For each time point, ten midguts were placed in
20 µl of RNAlater (Sigma-Aldrich, St. Louis, MO) and stored at 4°C prior to RNA
extraction. Total RNA was extracted using Agencourt RNAdvance Tissue RNA
extraction kit (Beckman, Beverly, MA) according to the provided instructions. To
prepare the tissues for RNA extraction, 200 µl of tissue homogenization buffer was added
and the tissues were homogenized using a Kontes pestle (Fisher, Itasca, IL); another 200
µl of homogenization buffer was added prior to incubation. The optional DNase
procedure was performed according to instructions using RNase-free DNase (New
England Biolabs, Ipswich, MA). RNA was eluted from the magnetic beads using
molecular grade water and samples were stored at -70°C prior to dilution and experimental
use.

Multiplex  RT-PCR 

Primers and genes used in the multiplex reaction are described in Table 29. Each
forward and reverse primer was designed using GeXP Express Profiler, Primer Design
software (Beckman, Fullerton, CA) to produce PCR products with lengths from 100 to
300 bp, spaced 4 to 7 bp apart. Additionally, the primers used in the multiplex
reaction are chimeric, with universal primer sequences of 18 and 19 nucleotides
appended to the forward and reverse primers, respectively. Transcript expression
profiles were assessed using the GenomeLab GeXP Analysis System Multiplex RT-PCR
assay (Beckman). The GenomeLab GeXP Start Kit (Beckman) was used for the reverse
transcription reaction, which contained 3 µl DNase/RNase-free H2O, 4 µl RT buffer,
5 µl Kanr RNA with RI, 1 µl reverse transcriptase, 5 µl sample RNA (2 ng/µl), and
2 µl reverse primer plex (concentration noted in Table 29). For RT-minus and
no-template control reactions, DNase/RNase-free H2O was substituted for the reverse
transcriptase or the sample RNA, respectively. The reverse transcription reactions
were incubated at 48°C for 1 minute, 37°C for 5 minutes, 42°C for 60 minutes, and 95°C
for 5 minutes, and then stored at 4°C. The PCR amplification of each sample included
4 µl PCR buffer (contains the universal primers), 4 µl 25 mM MgCl2 (Abgene, Rockford,
IL), 2 µl forward primer plex (200 nM of each primer), 0.7 µl DNA polymerase (Abgene)
and 9.3 µl cDNA from the reverse transcription reaction. The PCR samples were
incubated at 95°C for 10 minutes, followed by 35 cycles of 94°C for 30 seconds, 55°C
for 30 seconds and 68°C for 1 minute, and then stored at 4°C. Sample PCR products were
pre-diluted 1:20 using 10 mM Tris-HCl, pH 8.0, while no-template and RT-minus
reactions were not pre-diluted. All samples were then diluted by the addition of 1 µl
sample and 0.5 µl DNA Size Standard-400 into 38.5 µl Sample Loading Solution, mixed by
pipetting, and covered with 1 drop of mineral oil. The samples were placed in the
GenomeLab GeXP Genetic Analysis System for capillary electrophoresis and fragment
size analysis. Fragment results were analyzed using the eXpress Analysis software of
the GeXP Genetic Analysis System.
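
As a quick consistency check on the reaction set-up described above, the reagent volumes can be tallied in a short script. This is only a sketch: the dictionary labels and helper names are hypothetical and chosen for illustration, and the MgCl2 figure counts only the added 25 mM stock, since the text does not say whether the PCR buffer contributes any.

```python
# Reagent volumes in microlitres, copied from the protocol text above.
# The dictionary keys are informal labels chosen for this sketch.
RT_MIX = {
    "DNase/RNase-free H2O": 3.0,
    "RT buffer": 4.0,
    "Kanr RNA with RI": 5.0,
    "reverse transcriptase": 1.0,
    "sample RNA": 5.0,
    "reverse primer plex": 2.0,
}
PCR_MIX = {
    "PCR buffer": 4.0,
    "25 mM MgCl2": 4.0,
    "forward primer plex": 2.0,
    "DNA polymerase": 0.7,
    "cDNA": 9.3,
}
LOADING_MIX = {
    "PCR product": 1.0,
    "DNA Size Standard-400": 0.5,
    "Sample Loading Solution": 38.5,
}


def total_volume(mix):
    """Return the summed volume of a reaction mix in microlitres."""
    return round(sum(mix.values()), 3)


def final_concentration(stock, volume, total):
    """Apply C1*V1 = C2*V2 for `volume` of `stock` diluted to `total`."""
    return stock * volume / total


rt_total = total_volume(RT_MIX)
pcr_total = total_volume(PCR_MIX)
loading_total = total_volume(LOADING_MIX)

# RNA input per reverse transcription reaction (sample RNA is at 2 ng/ul).
rna_input_ng = RT_MIX["sample RNA"] * 2.0

# MgCl2 contributed by the added stock only.
added_mgcl2_mM = final_concentration(25.0, PCR_MIX["25 mM MgCl2"], pcr_total)

# Overall dilution of a sample PCR product: 1:20 pre-dilution, then 1 ul
# into the loading mix (controls skip the pre-dilution step).
overall_dilution = 20 * loading_total / LOADING_MIX["PCR product"]

print(rt_total, pcr_total, loading_total)
print(rna_input_ng, added_mgcl2_mM, overall_dilution)
```

Both the reverse transcription and PCR mixes sum to 20 µl, which is a useful sanity check when a volume has been transcribed incorrectly.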

Real-time  RT-PCR

Primer sequences used in real-time RT-PCR analysis are identical to those used in
the multiplex quantitative RT-PCR and are shown in Table 29. cDNA synthesis was
accomplished using the Transcriptor First Strand cDNA Synthesis Kit (Roche,
Indianapolis, IN) and the included anchored-oligo(dT)18 primer according to the
manufacturer's instructions, using 12 µl sample RNA (2 ng/µl). PCR was performed using
LightCycler® 480 SYBR Green I Master (Roche) according to the manufacturer's protocol,
using 4 µl of cDNA. Real-time quantitation of transcript abundance was performed using
the LightCycler® 480 Instrument under the following conditions: 95°C for 5 minutes,
followed by 45 cycles of 95°C for 10 seconds, 55°C for 10 seconds and 72°C for 20
seconds. Fluorescence was acquired after amplification at the end of each cycle.

Data  analysis 

The Kanr RNA served as an internal control for all multiplex reactions, and the
Lu. longipalpis 40S ribosomal protein S7 housekeeping gene served as the control for
normalization. Transcript expression levels analyzed by multiplex RT-PCR were
normalized by dividing the peak area of each gene by the peak area of the
housekeeping gene and then log2-transformed. The relative abundance of transcripts
quantitated by real-time PCR was analyzed by the log2 back-transformation of the Ct
ratio of the sample amplicon to the control amplicon and reported as the fold increase
in sample transcript over that of the control transcript. Three multiplex assays or
real-time reactions were completed for each experiment, and significant differences
between L. infantum chagasi-infected and non-infected Lu. longipalpis midgut tissue
samples were identified using Student's t-test, with p-values below 0.05 reported as
significant.
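
The multiplex normalization and group comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in this work: the function names and peak areas are hypothetical, the t statistic is the pooled-variance (Student's) form, and significance is judged against the two-sided 5% critical value for 4 degrees of freedom (2.776) rather than an exact p-value.

```python
import math
from statistics import mean, variance


def normalize_log2(peak_area, housekeeping_area):
    """log2 of a gene's peak area divided by the housekeeping (S7) peak area."""
    if peak_area <= 0 or housekeeping_area <= 0:
        raise ValueError("peak areas must be positive")
    return math.log2(peak_area / housekeeping_area)


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical (gene peak area, S7 peak area) pairs, three assays per group.
infected = [(4000, 2000), (4400, 2000), (3600, 2000)]
uninfected = [(1000, 2000), (1100, 2000), (900, 2000)]

inf_norm = [normalize_log2(g, s7) for g, s7 in infected]
unf_norm = [normalize_log2(g, s7) for g, s7 in uninfected]

t_stat, df = student_t(inf_norm, unf_norm)
T_CRIT_DF4 = 2.776  # two-sided critical value at alpha = 0.05, df = 4
significant = abs(t_stat) > T_CRIT_DF4

print(mean(inf_norm) - mean(unf_norm), t_stat, df, significant)
```

If SciPy is available, `scipy.stats.ttest_ind` with its default equal-variance setting should return the same statistic together with an exact p-value.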

Overview

The objective of this dissertation was to further understand the relationship
between the sand fly vector and the host as well as the relationship between the sand
fly and the Leishmania parasite. While much of biological research is moving well into
the "post-genomic era," the field of sand fly biology is still awaiting the completion
of genome sequencing. In order to formulate and test important biological hypotheses
in the absence of genome data, we utilized functional transcriptomic technologies. By
making use of cost-effective, productive and efficient high-throughput cDNA library
sequencing, functional transcriptomics provides information nearly equivalent to that
of genomics. Even after the release of an organism's genome, there is little useful
data without further annotation of introns and exons, a task that often relies on
proteomics and transcriptomics. By embracing functional transcriptomics, we are able
to formulate hypotheses and answer essential research questions regarding disease
vectors in a field that currently remains in the "pre-genomic era." This research has
contributed to the current knowledge of sand fly biology and vector-pathogen
interactions while advancing the application of functional transcriptomics. This work
characterized a unique salivary enzyme in Phlebotomus duboscqi, catalogued and
analyzed a large set of midgut-specific transcripts from Phlebotomus papatasi and
Lutzomyia longipalpis, and additionally demonstrated the impact of Leishmania infantum
chagasi and L. major on the temporal profile of sand fly midgut transcripts.


Phlebotomus  duboscqi  adenosine  deaminase 
Prior to our research, it was commonly accepted that adenosine deaminase
(ADA), described in Lu. longipalpis, was a salivary component only of the New World
Lutzomyia sand flies. This concept was overturned by the discovery of two transcripts
in the Old World sand fly P. duboscqi that likely encode an enzyme previously
considered exclusively a New World salivary molecule. To rule out the possibility that
the transcripts were evolutionary relics encoding an inactive enzyme, we demonstrated
enzymatic activity of ADA in the saliva. Additionally, purified recombinant proteins
produced from the transcript sequences demonstrated ADA activity. The saliva of P.
duboscqi does not contain any adenosine or adenosine monophosphate (AMP), which are
abundant molecules in the saliva of other Phlebotomus sand flies. The copious salivary
adenosine and AMP in P. papatasi function as vasodilators, whereas the peptide
maxadilan acts as a vasodilator in Lu. longipalpis. For that reason, there is likely a
novel vasodilator molecule in the saliva of P. duboscqi that may be of importance in
disease transmission, saliva-based vaccine research and pharmacology. It is unclear
what role ADA plays in blood-feeding. ADA has an important role in immunity as a
result of the effects of adenosine, 2'-deoxyadenosine and the hydrolytic products of
these compounds [1]. Further work will be necessary to determine the effect of this
enzyme on parasite transmission and disease progression. It is possible that this
enzyme changes the environment in the skin where the Leishmania parasite is deposited
by the sand fly. Inosine is the primary metabolite of adenosine produced by ADA and has
been shown to inhibit the production of proinflammatory cytokines including TNF-α,
IL-1, IL-12, MIP-1α and IFN-γ in stimulated macrophages and spleen cells [2].
Additionally,
adenosine can stimulate mast cell degranulation, causing the release of histamine and
serotonin [3]. The presence of ADA in the saliva of P. duboscqi may reduce the
potential degranulation of mast cells and the generation of an unfavorable feeding
site due to the potent and immediate effects of histamine and serotonin. Another
hypothesis regarding the function of ADA in sand fly saliva is that the enzyme plays
an intermediary role in the degradation of adenosine to uric acid. If ingested by the
sand fly, uric acid would act as an extremely effective peroxynitrite scavenger,
protecting the midgut tissue from the toxic effects of peroxynitrite [4]. Ultimately,
it is possible that inosine generation by ADA activity favors a Th2 environment that
will benefit parasite establishment in the skin of the mammalian host. The
identification of these ADA transcripts in P. duboscqi demonstrates the independent
acquisition of distinctive blood-feeding strategies. It is now apparent that the
careful analysis of vector salivary components needs to be considered at a
species-specific level, with due respect to the evolutionary force of host hemostasis
on hematophagous arthropods. Foremost, the identification and characterization of ADA
in P. duboscqi demonstrates how functional transcriptomics can be used to gain
knowledge about the biological functioning of an organism. Future work may therefore
focus on the effect of ADA on parasite transmission and/or disease progression.


Functional  transcriptomics  of  the  sand  fly  midgut
We constructed and sequenced several cDNA libraries generated from the midgut
tissue of female P. papatasi and Lu. longipalpis. Analyzing cDNA libraries created
under different conditions in the sand fly midgut allowed for the comparison between
sugar-fed, blood-fed, and blood-fed L. major-infected P. papatasi. Using a similar
approach with five Lu. longipalpis midgut cDNA libraries, we were able to compare
sugar-fed, blood-fed, and post-blood meal digestion conditions in the absence or
presence of L. infantum chagasi. We then utilized this information to reproducibly
identify the up- or downregulation of novel midgut proteins in each sand fly vector.
These include microvillar proteins, peritrophins, trypsin, chymotrypsin and several
uncharacterized proteins. These studies are also the first reports of transcript
modulation by the presence of Leishmania parasites within the midgut of sand flies. It
is likely that Leishmania parasites modulate the expression profile of other
molecules, but our approach was not able to detect them, probably as a result of their
low abundance. Use of a normalized or subtracted cDNA library would yield far more
information about low-abundance transcripts, but would not allow the analysis of
transcriptional modulation occurring under the conditions assessed by our research.
The microvillar proteins (MVPs) are among the more interesting molecules
characterized. The function of MVPs is unknown; however, based on the divergent amino
acid sequences of these proteins, they may have very different roles. In both P.
papatasi and Lu. longipalpis, the presence of L. major or L. infantum chagasi,
respectively, resulted in significant changes in the blood-induced MVP sequences
captured by the cDNA library. Future work will include the expression of Lu.
longipalpis MVPs and the generation of polyclonal antibodies. The recombinant proteins
and antibodies could then be used to investigate the effect of MVPs on L. infantum
chagasi as well as the effect of anti-MVP antibody on blood digestion and sand fly
longevity. Additionally, we would like to study the cellular localization of MVPs
within the midgut tissue.


Another interesting class of molecules identified by functional transcriptomics is
the peritrophins. Leishmania colonization induced changes in the transcript abundance
of peritrophin proteins in both P. papatasi and Lu. longipalpis. However, PpPer1 was
downregulated while LuloPer1 was induced when the sand flies were infected with L.
major or L. infantum chagasi, respectively. Future work will likely include the
expression of peritrophin molecules of Lu. longipalpis to demonstrate chitin-binding
activity. Additionally, it would be intriguing to see whether peritrophin-specific
antibody could abrogate peritrophic matrix formation and, if so, what effects this
could have on the development of L. infantum chagasi.

The submitted sequences in NCBI's GenBank in March of 2007 contained eight
midgut-specific sequences from P. papatasi and one from Lu. longipalpis. Subsequently,
we have submitted 65 and 59 high-quality sequences, primarily full-length nucleotide
and protein sequences, to GenBank from the midgut transcriptomes of P. papatasi and Lu.
longipalpis, respectively. The sequence data submitted to GenBank from this research
comprises over half of the total currently available annotated nucleotide and protein
sequence data for P. papatasi and Lu. longipalpis. Additionally, 4,439 and 9,533 ESTs
were submitted to NCBI's EST database for P. papatasi and Lu. longipalpis,
respectively. The transcriptomic approach presents a broader view and thus provides a
more complete picture of an organism's functioning. In fact, the functional
transcriptomic approach taken here offers both a global vantage (systematics) of the
midgut transcriptome and the exhaustive detailing (reductionism) of individual
sequence elements of the "-ome." The research presented in this dissertation addresses
both systematic and reductionist viewpoints: (1) the processes of blood-feeding and
digestion and the impact of Leishmania colonization of the sand fly midgut and (2) the
comparison of amino acid sequences from the sequenced transcripts to infer putative
function and contrast evolutionary dissemination.

Temporal  profiling  of  Lutzomyia  longipalpis  midgut  transcripts 
Functional transcriptomics can provide an overwhelming amount of information.
Using a Chi-square analysis to compare sequence abundance between cDNA libraries, we
added order to the database in the context of blood digestion and Leishmania impact on
sand fly midgut transcripts. Identifying molecules whose expression is altered by the
presence of Leishmania, as measured by this methodology, was an excellent preliminary
finding. We selected sequences identified in the Lu. longipalpis cDNA libraries that
were modulated by blood-feeding and parasite infection and reaffirmed this transcript
alteration with greater sensitivity. This task was performed using a recently
developed multiplex reverse transcriptase PCR system, analyzing midgut RNA that was
collected at daily intervals to generate an informative temporal transcript profile.
Our findings confirm that blood feeding and L. infantum chagasi manipulate a number of
transcripts including carboxypeptidases, microvillar proteins, and peritrophins. The
manipulation of the sand fly midgut transcriptome by Leishmania parasites may serve to
benefit the parasite's survival and necessary differentiation to the final, infective
metacyclic form. Additionally, the presence of the parasite within the midgut of the
sand fly may trigger protective responses, such as increased protease expression,
suggesting that the Leishmania colonization of the sand fly is neither mutualistic nor
commensal. With the data provided by the cDNA libraries and the temporal profiling
techniques used here, another avenue of sand fly-Leishmania interactions can be
researched. Our future endeavors include the temporal profiling of P. papatasi
transcripts using normal blood-fed sand flies and L. major-infected sand flies.
Moreover, we will analyze the impact of non-natural sand fly-parasite pairings (Lu.
longipalpis infected with L. major and P. papatasi infected with L. infantum chagasi)
in order to better understand molecular determinants of vector competence.
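
The Chi-square comparison of sequence abundance between cDNA libraries mentioned at the start of this section can be illustrated with a small sketch. It assumes a 2x2 contingency table (sequences in a given cluster versus all other sequences, for two libraries) without continuity correction; the exact form of the test is not spelled out here, and the function name and counts below are hypothetical.

```python
import math


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Chi-square statistic and p-value (1 degree of freedom) for one
    transcript cluster counted in two cDNA libraries.

    hits_x  : number of sequences from library x assigned to the cluster
    total_x : total number of sequences obtained from library x
    """
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = total_a + total_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical example: 30 of 1,000 sequences in a blood-fed library versus
# 10 of 1,000 sequences in a blood-fed, infected library.
chi2, p_value = chi_square_2x2(30, 1000, 10, 1000)
print(round(chi2, 3), p_value < 0.05)
```

With only one degree of freedom the p-value can be computed from the complementary error function, so no statistics package is needed; for small expected counts an exact test would be the more cautious choice.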

In addition to the aforementioned benefits of this research and the future goals,
the identification of molecules necessary for successful colonization of the sand fly
by Leishmania has important public health implications. A molecule that is vital for
the generation of transmissible parasites, by allowing the full development of the
parasite within the sand fly, is a potential transmission-blocking vaccine candidate.
A transmission-blocking vaccine may be of limited use in controlling transmission of
parasites acquired from zoonotic reservoirs, such as rodents. However, in endemic
areas in which canines are the principal reservoir for L. infantum chagasi, it is very
likely that a canine-targeted transmission-blocking vaccine would greatly reduce the
potential spread of a fatal parasitic disease.